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FOREWORD 


The  21st  Conference  on  the  Doslgn  of  Experiments  In  Army  Research, 
Development  and  Testing  was  held  22-24  October  1975  In  Washington,  DC. 

The  Conference,  which  took  place  at  the  Walter  Reed  Medical  Complex,  had 
two  hosts:  the  Walter  Reed  Army  Medical  center  and  the  Armed  Forces 
Institute  of  Pathology.  Both  hosts  furnished  excellent  conference  rooms 
and  meeting  rooms  for  this  symposium.  Planning  for  these  meetings  re- 
quires painstaking  attention  to  detail  and  we  are  Indebted  to  Dr.  Walter 
D.  Foster  and  Dr.  James  N.  Young;  both  of  the  Armed  Forces  Institute  of 
Pathology,  for  serving  well  as  Chairmen  for  Local  Arrangments.  We  are 
pleased  that  Major  General  Robert  Bernstein,  Commander  of  the  Walter  Reed 
Army  Medical  Center,  opened  the  Conference  and  welcomed  us.  This  Is  not 
the  first  meeting  tc  be  held  at  the  Walter  Reed  Installation.  On  each 
occasion,  the  reception  given  us  has  been  excellent,  and  we  look  forward 
to  meetings  there  again  in  the  future, 

There  were  four  addresses  by  Invited  speakers.  Traditionally  an 
attempt  Is  made  by  the  Program  Committee  to  have  expository  talks  on 
themes  3omewhat  pertinent  to  the  mission  of  the  Army  installation  at 
which  the  annual  conference  Is  held.  Success  along  these  lines  was 
achieved  again.  The  first  address  was  given  by  Frederick  Mosteller  of 
Harvard  University,  who  spoke  on  "Success  In  Social  and  Medical  Experi- 
mentation." Dr.  Mosteller  was  given,  at  his  request,  two  hours  to  de- 
liver his  address.  Normally,  there  would  have  been  five  invited  addresses, 
but  the  length  of  Professor  Mosteller 's  talk  led  to  four  at  this  meeting. 
Dr.  Mosteller's  talk  was  given  at  the  first  morning  of  the  Conference 
and  was  followed  In  the  late  afternoon  by  two  papers  on  clinical  trials. 
There  has  been  much  in  the  medical  and  statistical  literature  on  this 
topic.  Professor  Ecknund  A.  Gehan  of  the  University  of  Texas  System 
Cancer  Center  spoke  un  "Non-randomised  Clinical  Trials"  and  Professor 
Paul  Meier  of  the  University  of  Chicago  addressed  the  audience  on 
"Randomized  Clinical  Trials."  On  the  second  day  of  the  Conference, 
Professor  Seymour  Geisser  of  the  University  of  Minnesota  gave  an  In- 
vited address  on  "Predictive  Sample  Reuse."  This  was  followed  on  the 
morning  of  the  last  day  of  the  Conference  by  a talk  on  "Normality  and 
Disease"  given  by  Professor  Edmond  A.  Murphy  of  the  Johns  Hopkins 
Medical  School. 

One  major  purpose  of  the  Conference  is  to  bring  together  those 
engaged  in  scientific  work  In  Army  installations  with  investigators 
from  other  government  agencies  and  those  from  university  life.  This 
Interaction  has  been  going  on  successfully  since  the  Inception  of  the 
program.  Statisticians  and  others  in  Army  installations  discuss  their 
work  at  technical  sessions  and  clinical  sessions  at  each  annual  con- 
ference. For  this  Conference  there  were  seven  technical  sessions  com- 
prising 24  papers  and  four  clinical  sessions.  At  the  clinical  sessions 
a panel  of  experts  responds  to  problems  raised  by  those  in  Army  instal- 
lations who  have  usually  given  advance  manuscript  copies  to  the  panelists. 


Hi 


Besides  the  technical  aspects , these  sessions  provide  a source  for 
Initiating  future  collaboration  between  scientists  In  Army  Installa- 
tions and  those  In  university  life. 

At  the  start  of  this  year's  opening  session.  Or.  Halter  D.  Foster 
was  honored  with  a Certificate  for  Achievement  for  the  valuable  con- 
tributions he  made  during  his  twelve  yoars  as  Chairman  of  the  Probability 
and  Statistics  Subcomml ttee  of  the  Army  Mathematics  Steering  Comnlttee. 
t!a  was  specifically  cited  for  "continuously  and  vigorously  crusading 
for  application  of  sound  statistical  principles  and  methodology  to 
problems  In  Army  research  and  development." 

On  the  evening  of  the  first  day  of  the  Conference,  a banquet  Is 
held  at  which  the  Samuel  S.  Wilks  Memorial  Award  of  the  American 
Statistical  Association  and  the  Department  of  the  Army  Is  presented. 

At  this  meeting  the  11th  award  was  presented  by  Lester  Frankel , Presi- 
dent of  the  ASA,  to  Dr.  Herbert  Solomon,  Professor  of  Statistics,  Stan- 
ford University.  The  award  was  made  to  Dr.  Solomon  for  his  significant 
contributions  to  statistical  methodology  and  for  his  outstanding  contri- 
butions In  the  application  of  statistics  In  the  service  of  the  nation. 

The  Army  Mathematics  Steering  Committee  sponsors  these  meetings  on 
behalf  of  the  Office  of  the  Chief  of  Research  and  Development  and  Ac- 
quisition to  bring  new  developments  In  statistics  to  Army  scientists 
and  engineers  and  to  expose  them  to  thinking  that  could  be  profitable 
to  them  In  the  execution  of  their  missions.  The  Committee  has  asked 
that  the  proceedings  of  the  Conference  be  published  and  Issued  Army- 
wide and  to  other  scientific  communities. 

At  the  beginning  of  each  calendar  year  the  program  committee  for 
these  conferences  Is  selected  and  meets  In  Washington,  DC,  to  suggest 
areas  of  Interest,  to  outline  a program,  and  to  suggest  speakers  for 
the  meeting  to  be  held  later  that  year.  I would  like  to  express  my 
appreciation  to  Dr.  Frank  Grubbs,  Program  Chairman  for  this  year's 
committee,  and  to  Dr.  Douglas  Tang,  Chairman  of  the  Subcommittee  on 
Probability  and  Statistics,  Army  Mathematics  Steering  Committee,  for 
their  efforts  and  great  help.  My  thanks  also  go  to  other  committee 
members  Involved  In  developing  this  year's  program:  Drs.  David  W. 

Ailing,  Gary  A.  Chase,  Walter  D.  Foster,  Bernard  Harris,  J.  Stuart 
Hunter,  Clifford  J.  Maloney,  Badri g Kurkjlan,  Marvin  Schnelderman. 

Francis  Dressel,  as  always,  was  helpful  In  many  ways  In  making  sure 
the  program  was  a success.  Thus  many  hands  helped  In  guiding  this 
Conference  to  a successful  conclusion,  and  this  Is  very  much  appre- 
ciated. 


Herbert  Solomon 
Conference  Chairman 
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Abstract.  The  interface  between  5.56mm  ball  and  tracer  bullet 
designs  and  various  rifling  configurations  are  examined  to 
determine  the  effects  on  ballistic  performance  and  mechanical 
i integrity  as  would  be  experienced  under  general  purpose 
j machine  gun  operational  modes. 

; Two  modes  of  projectile  failure  are  examined  against 

light  machine-gun  system  design  criteria.  Based  on  these 
| results,  optimum  rifling  configurations  are  identified  for 
< use  in  a machine-gun  system. 

i 

j Verification  of  these  optimized  rifling  designs  through 

experimentation  are  discussed. 

j 

1.  Introduction.  Initial  interest  in  the  study  of  those 
parameters  errecting  barrel/bullet  interface  was  generated 
! at  Frankford  Arsenal  under  the  6mm  tracer  program.  At  that 
i time,  the  6mm  ball  and  tracer  cartridges  were  the  prime 

i ammunition  candidates  for  the  Squad  Automatic  Weapon  (SAW) , 

and  consequently  great  concern  was  expressed  at  a high 
incidence  of  tracer  projectile  failures  (break-up)  then 
being  observed  during  both  test  barrel  and  weapon  barrel 
performance  tests. 

; Table  1 categorizes  various  tracer  projectile  malfunctions 

from  four  and  six-groove,  plated  and  unplated  weapon  and  test 
barrels.  This  chart  shows  the  frequency  of  projectile  failures 
1 from  four-groove  plated  weapon  barrels  and  to  a lesser  degree 
in  four-groove  plated  test  barrels. 

As  a result  of  this  high  incidence  of  projectile  failure, 

\ an  analytic  stress  study  was  undertaken  to  examine  certain 
i modes  of  failure  which  could  explain  the  type  of  projectile 
break-up  being  exhibited. 
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2.  Stress  Evaluation.  Th  typical  6mm  tracer  failure  as 
observed  in  recovered  projectiles  was  evidenced  by  a radial 
flaring  of  the  projectile  base  and  longitudinal  separation 
of  the  projectile  jacket,  as  if  the  pyrotechnic  column 
exploded  after  muzzle  exit. 

The  modes  of  projectile  failure  examined  in  the  initial 
stress  study  were: 

a.  The  shear  deformation  or  out-of-roundness  occurring 
in  the  projectile  jacket. 

b.  The  stress  field  encountered  by  the  projectile 

jacket  after  engraving  and  during  acceleration  of  the  projectile, 

Shortly  after  the  initiation  of  the  stress  study,  DA 
guidance  was  received  eliminating  the  6mm  concept  from 
inclusion  as  a SAW  contender.  Developmental  efforts  were 
redirected  towards  the  consideration  of  a 5. 56mm  SW 
ammunition  contender,  which  was  easily  included  in  the  analytic 
study.  Shown  in  Table  2 are  the  pertinent  projectile 
characteristics  for  the  5.56mm  concepts  under  development. 

In  selecting  an  ammunition  design  as  a SAW  contender,  several 
design  criteria  were  applied  to  the  analysis  in  order  to 
define  the  use  of  the  projectile  and  weapon  barrel  in  a light 
machine-gun  role.  These  design  criteria  are  outlined  in 
Table  3.  In  addition  to  these  design  parameters  addressing 
projectile  integrity,  any  interior  bore  configuration  must 
satisfy  other  basic  performance  requirements  such  as  projectile 
accuracy,  barrel  life  under  machine-gun  firing  schedules, 
interior  ballistics,  terminal  effectiveness  and  high  rate 
manufacture  by  current  methods. 

The  effect  of  shear  deformation  on  the  projectile  integrity 
was  considered  by  applying  thin-ring  theory  to  the  projectile 
jacket  with  Hn"  distributed  forces  being  applied  corresponding 
to  the  number  of  lands.  The  results  of  the  analysis  indicated 
that  during  the  engraving  process  it  is  desirous  that  the 
pressure  under  the  land  be  as  large  as  possible  for  any  given 
deflection.  The  reason  for  this  is  that  the  engraving  is 
caused  by  the  jacket  material  becoming  plastic,  and  the  smaller 
the  deflection  that  is  encountered  when  the  material  goes 
plastic,  then  the  less  out-of-roundness  that  will  be  incurred 
by  the  jacket.  When  considering  this  result  relative  to  the 
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pressures  and  deflections  induced  by  four  and  six-groove 
barrels,  the  results  clearly  indicate  that  the  six-groove 
configuration  is  clearly  superior  to  the  four-groove  even 
when  comparing  a six-groove  barrel  with  minimum  land  height 
to  a four-groove  with  a maximum  land  height. 

The  stress  field  developed  on  the  jacket  after  engraving 
and  during  acceleration  was  addressed  by  considering  a 
pressure  gradient  acting  from  the  bottom  to  the  top  of  the 
engraved  surface.  By  relating  this  pressure  distribution 
to  the  depth  of  engraving,  minimum  values  of  engraving  depth 
were  calculated  such  that  the  probability  of  jacket  shearing 
is  reduced.  This  minimum  depth  of  engraving  was  shown  to 
be  .0017  in.  for  the  four-groove  barrel  and  .0011  in. for  the 
six-groove.  These  minimum  engraving  depths  were  applied  to 
the  analysis  in  determing  optimum  bore  configurations. 

Optimum  Bore  Dimensions  and  Projectile  Compatibility,  when 
considering  the  minimum  engraving  depths  required  together 
with  the  pertinent  design  criteria  and  projectile  dimensions, 
it  is  possible  to  compute  optimum  rifling  dimensions  such  that 
the  types  of  system  failures  considered  will  be  minimised. 

This  was  done  for  the  projectiles  being  developed  by  relating 
the  minimum  engraving  depths  required  such  that  jacket  shear 
does  not  take  place  as  a function  of  projectile  diameter, 
bore  diameter,  barrel  temperature,  jacket  deformation  due  to 
engraving  and  land  wear.  This  relationship  is  shown  in 
equation  1-1. 


(1-1)  le  - Rp  - Rbo  (1  + aAT  ) - Wfa  - uLy 

where,  la  * minimum  engraving  depth  required 

Rbo  ■ bore  radius  or  land  radius 

Rp  - projectile  radius 

a - coefficient  of  thermal  expansion 

AT  * barrel  temperature  gradient  under  hot  condition 
Wjj  - barrel  wear 

uLy  " jacket  displacement  before  yielding 

By  solving  equation  1-1  for  Rbo,  the  land  diameter  suited 
to  each  projectile  design  can  be  found.  The  optimum  groove 
size  was  derived  such  that  the  smallest  projectile  diameter 
used  in  the  bore  will  have  the  same  diameter  as  the  groove 
at  its  highest  temperature  as  shown  in  equation  1-2.  This 
would  correspond  to  the  barrel  temperature  reached  under 
sustained  firing  schedules. 
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(1-2) 
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groove  diameter 

minimum  projectile 
diameter 

coefficient  of 
thermal  expansion 

barrel  temperature 
gradient 


The  optimum  barrel  dimensions  calculated  using  equations 
1-1  and  1-2  are  shown  in  Table  4.  Note  that  configurations 
1 and  2 are  optimum  based  on  tracer  projectiles  of  differing 
diameters  while  configuration  3 considers  an  increased  land 
height  for  larger  barrel  wear  over  configurations  1 and  2. 

Standard  5.56mm  barrel  dimensions  are  shown  as  reference. 

A numerical  exercise  was  performed  utilizing  the  optimum 
rifling  dimensions  and  projectile  dimensions  to  demonstrate 
the  range  of  in-bore  interferences  and  clearances  possible 
under  "best"  and  "worst"  design  conditions.  Table  5 summarizes 
the  results  of  this  exercise  giving  a range  of  interference/ 
clearance  values  for  both  standard  5.56mm  bore  configuration  and 
optimized  configurations.  To  properly  compute  these  interference/ 
clearance  values , the  following  parameters  were  considered: 

a.  minimum  and  maximum  bullet  diameters  (ball  and  tracer) 

b.  minimum  and  maximum  land  and  groove  diameters 

c.  .0005  in.  diametrical  land  wear  0 

d.  diametrical  bore  expansion  at  1250  F 

Table  6 lists  the  equations  used  to  compute  the  ranges 
of  interference/clearance  and  minimum  land  height  values. 

In  comparing  the  standard  barrel  designs  with  the  optimized 
cases,  it  is  important  to  view  these  results  in  a strictly 
statistical  sense  in  that  projectile  deformation  into  the 
barrel  grooves  was  not  considered.  However,  despite  the 
rather  static  condition  under  which  these  numbers  were 
generated,  a major  difference  among  designs  can  be  noted. 

In  all  cases,  the  optimized  designs  exhibit  a greater 
projectile/barrel  interference,  or  lesser  projectile/barrel 
clearance  than  the  standard  barrel  dimensions.  This  important 
difference  is  the  direct  result  of  attempting  to  accommodate 
differing  ball  and  tracer  projectile  diameters  while  insuring 
satisfactory  system  performance  over  a temperature  range  from 
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TABLE  4 


» 56MM  (SAW)  AMMPNITION/WEAPON  INTERFACE 


ambient  to  1250  F.  These  design  parameters  are  further 
aggravated  by  considering  land  wear. 

Comparing  the  interferences  and  clearances  shown  in 
Table  5 with  the  minimum  required  land  engagement  of  .0011 
in.  for  six-groove  configurations  shows  possible  problem 
areas.  Despite  the  fact  that  the  minimum  land  heights  under 
worst  conditions  exceed  this  .0011  in.  requirement,  it  is 
not  necessarily  true  that  proper  engraving  will  occur.  This 
situation  occurs  in  the  5.56mm  standard  six-groove  design, 
for  both  ball  and  tracer  comparisons.  Although  the  minimum 
land  height  at  1250°F  is  adequate  for  the  required  .0011  in. 
engraving,  this  engraving  cannot  occur  if  the  projectile/ 
land  interferences  run  as  low  as  .0005  in.,  as  it  does  for 
the  tracer.  This  minimal  interference  could  lead  to  a 
serious  skidding  problem. 

Experimental  Evaluation.  The  accuracy  of  the  analysis, 
as  well  as  the  suit/oility  of  any  barrel  design  to  field  use, 
can  only  be  verified  through  extensive  testing.  Toward 
this  end,  a quantity  of  barrels  of  various  configurations 
has  been  procured  for  evaluation  of  system  performance 
levels.  Table  7 is  a matrix  showing  the  quantity  and  types 
of  barrels  which  will  be  the  core  of  an  exhaustive  barrel 
performance  program.  These  barrels  will  be  tested  along 
with  approximately  45,000  rounds  of  5.56mm  ball  and  tracer 
ammunition  against  current  SAW  performance  requirements 
so  that  sufficient  statistical  significance  is  obtained, 
pointing  to  a singular  rifling  configuration. 

Plans  for  testing  currently  envision  adhering  to  current 
acceptance  standards  for  5.56mm  and  7.62mm  ammunition  and 
will  mirror  sample  sizes  of  barrels  and  ammunition  contained 
therein. 


TABLE  7 


5.56MM  (SAW)  AMMUNITION/WEAPON  INTERFACE 
BARREL  MATRIX 


N.  BARREL 

X.  type 

BORE  X. 

ACCURACY 

PRESSURE 

WEAPON* 

(CHROMED) 

WEAPON* 

(UNCHROMI 

CONFIGURATIONS^ 

QUANTITY 

i 

i 

STANDARD  5.56MM 
RIFLING 

2 

2 

3 

2 

6-GROOVE  BORE 
I IN  12  TWIST 
UNDERSIZED  TRACER 
(CONFIG.  1) 

2 

2 

3 

2 

6-GROOVE  BORE 
1 IN  11  TWIST 
UNDERSIZED  TRACER 
(CONFIG.  1) 

2 

2 

3 

2 

6-GROOVE  BORE 
1 IN  12  TWIST 
BALL  SIZE  TRACER 
(CONFIG.  2) 

2 

2 

3 

2 

6 -GROOVE  BORE 
1 IN  11  TWIST 
BALL  SIZE  TRACER 
(CONFIG.  2) 

2 

2 

3 

2 

6-GROOVE  BORE 
1 IN  11  TWIST 
INCREASED  LAND  HEIGHT 
FOR  ECCENTRICITY 
(CONFIG.  3) 

2 

2 

3 

2 
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DESIGN  OF  EXPERIMENTS  DEALING  WITH  MAN -MACHINE  INTERFACE 
IN  CURRENT  COMMUNICATIONS  SYSTEMS 

R.  J.  D'Accardl  and  H.  S.  Bennett,  U.S.  Electronics  Command, 

Fort  Monmouth,  New  Jersey 

J.  R.  Hennessy,  U.S.  Army  MERDC,  Fore  Belvolr,  Virginia 

ABSTRACT.  Recently,  the  US  Army  Electronics  Comnand  has  supported  experiments 
dealing  with  man-machine  Interface  problems  occurring  In  Tactical  Communications 
Systems.  The  aim  was  to  characterize  communications  system  operators'  per- 
formance under  various  environmental  conditions  related  to  tactical  operations. 
The  study  was  directed  towards  system  equipment  such  as  the  standard  teletype 
and  optical -read-only  terminal  equipments.  Using  these  devices,  the  signifi- 
cance of  acoustic  noise  and  ambient  light  on  operator  performance  was  studied 
under  sixteen  combinations  of  environmental  conditions. 

The  object  of  this  presentation  Is  threefold.  First,  we  discuss  the  methods 
of  evaluating  message  transfer  over  man-machine  Interfaces  to  Include  audio 
and  visual.  Second,  we  discuss  the  design  of  the  experiment  and  modeling  to 
determine  the  operator  characteristics  under  different  environmental  conditions, 
and  third,  we  present  statistical  estimates  of:  (a)  the  effects  of  the 
controlled  variables  (ambient  light  and  acoustic  noise)  upon  the  transcription 
accuracy  of  several  operators,  (b)  measures  of  experimental  error  to  define 
a range  of  values,  for  a prescribed  level  of  confidence,  within  which  the 
true  value  of  the  estimates  may  be  found,  and  (c)  the  most  significant 
combinations  of  environmental  effects  on  operator  performance.  Several  multi- 
variate regression  models  which  characterize  operator  performance  are 
presented  and  the  criteria  for  chcns Ing  the  best  model  are  discussed. 

INTRODUCTION.  Information  gained  In  evaluating  and  solving  man-machine 
Interface  problems  that  occur  In  complex  communications  systems  Is  extremely 
Important  to  systems  engineers  committed  to  the  mission  of  the  design  and 
fabrication  of  future  generations  of  equipment.  Sophisticated  systems  of 
Command  and  Control,  computer-aided  man-ln-the-loop  systems  (e.g.,  manned 
space  craft),  human  response  to  audio  and  visual  displays,  management  functions, 
pattern  recognition,  man-computer  languages,  cutaneous  communication  and  many 
other  facets  are  of  concern  where  an  operator  must  perform  a control  task,  or 
decision  task.  At  present  there  Is  a large  volume  of  on-going  work  oriented 
towards  man-machine  Interfaces  which  span  the  projected  needs  of  the  Armed 
Forces.  For  example,  work  In  progress  by  the  Naval  Electronics  Systems 
Command,  6570th  Aerospace  Medical  Research  Laboratory,  DA  ARI  for  the 
Behavioral  Sciences,  ECOM  and  HEL  (to  name  a few)  generally  deal  with  evalu- 
ation of  complex  system  Interfaces,  assessment  of  operator  performance 
capabilities  for  a wide  variety  of  tasks,  analysis  of  manual  functions  Into 
tasks,  analysis  of  human  control  functions,  and  the  physical  and  psychological 
characteristics  which  affect  the  assessment  of  operator  performance  capa- 
bilities. Much  of  the  on-going  work  concerns  the  psychological  and 
physiological  aspects  of  comnand  and  control  In  tactical  operations,  weapons 
systems,  vehicles  management,  logistics,  and  communications.  Some  of  the  more 
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specific  areas  of  Investigation  are: 

1.  Work/rest  schedules  and  effects  on  man-machine  performance. 

2.  Utilization  of  Bio-electric  phenomena  to  automatically  control 
complex  systems. 

3.  Measures  of  operator  performance  under  different  mixes  of  equipment, 
personnel  and  procedures. 

4.  Physiological  aspects  (fatloue,  alertness,  metabolism,  endocrine 
gland  functions,  and  central  nervous  system)  of  operator  efficiency 
and  man-machine  Interface. 

5.  System  simulation  to  study  the  Impact  of  operator  performance  on 
complex  systems  as  a function  of  environmental  threat,  mission,  and 
work  load  stress. 

6.  Army  Tactical  Flight  operations  under  adverse  visibility  conditions. 

7.  Influence  of  USAF  operational  environments  on  air  crew  utilization. 

Examination  of  ongoing  research  In  these  areas  Indicate  that  there  Is 
no  clear  cut  procedure  to  evaluate  the  human  subsystem  In  a sophisticated 
communications  system  or  the  effects  of  environmental  stress  on  operator 
performance.  Army  communications  requirements  In  a tactical  situation  often 
require  24  hour  operations  and  personnel  are  required  to  work  either  on 
standard  or  unpatterned  and  frequently  extended  duty  schedules,  In  a variety 
of  environments,  each  characterized  by  multiple  stresses  occurring  In  a 
random  manner.  For  example,  the  accuracy  In  reading  an  optical  display  Is 
dependent  on  many  variables  such  as  number  of  lines,  characters,  ambient 
lighting,  environmental  noise,  speed  of  display,  correction  time,  back-log, 
operator  physiology  (e.g.,  mood,  fatigue,  attention,  and  training),  display 
brightness  and  size,  and  effective  slgnal-to-nolse  ratio  (legibility)  to 
Name  a few.  Since  future  Army  requirements  Include  optical  display  terminals. 
It  Is  essential  to  provide  Insight  Into  those  variables  that  affect  accuracy 
through  the  man-machine  Interface  and  the  effects  caused  by  physiological 
factors.  To  answer  the  Army's  need  for  measures  of  man-machine  Interfaces 
which  occur  In  communications  systems  and  to  enhance  the  design  of  future 
families  of  equipment,  this  report  will  address  teletype  operator  per- 
formance as  the  environmental  factors  of  ambient  light  and  acoustic  noise 
are  varied.  The  design  of  the  experiment  performed  at  Ft.  Monmouth,  New 
Jersey  during  April  and  May  1975  and  results  are  discussed.  Experimental 
results  and  several  models  are  presented  which  show  the  significance  of 
these  variables  on  experienced  teletype  operators. 


jl 

DESIGN  OF  THE  EXPERIMENT.  The  significance  of  acoustic  noise  and  ambient  :i 

llgnt  on  operator  performance  was  Investigated  using  a visual  display  , 

transmission  device,  see  figure  1.  This  Is  a visual  terminal  designed  to 
Interface  with  computers  or  store-and-forward  devices.  Primarily,  It  Is 
a developmental  equipment  Intended  to  visually  present  messages  on  a CRT 
display  where  an  operator  can  see  and  correct  his  message  prior  to  transmission.  ,j 

The  advantages  of  this  equipment  over  the  standard  military  teletypewriter  ;? 

were  not  addressed  In  this  experiment. 

The  experiment  consisted  of  testing  the  transcription  accuracy  of  six 
experienced  commuricatlons-center  operators  under  16  combinations  of 
environmental  conditions.  Ambient  light  was  varied  at  four  levels,  ranging  * 

from  24  ft-candles  to  3 ft -candles,  and  acoustic  noise  was  concurrently  i 

varied  at  four  levels  ranging  from  55  dBa  to  95  dBa.  Sound  pressure  level 
(SPL)  measured  In  dBa  Is  In  reference  to  .0002  dynes/cnr.  This  Is  con- 
sidered the  threshhold  of  hearing  and  Is  roughly  equivalent  to  a leaf 
"falling"  on  a quiet  day.  The  55dBa  level  was  considered  the  quiet 
condition  where  only  the  Inherent  noise  from  the  terminal  equipment,  sound 
room  noise,  and  thermal  noise  were  recorded.  The  9$dGl4  level  represented 
an  extremely  annoying  and  distracting  "pink"  noise.  The  noise-power  per  1 

unit  frequency  for  this  type  of  noise  Is  Inversely  proportioned  to  frequency  1 

over  a specified  range  and  slopes  down  at  3dB  per  octave  from  20Hz  to  20KHz. 

These  characteristics  are  more  common  to  conference  type  noise  where  the 

higher  and  lower  frequency  components  characterize  motor  and  equipment 

noises.  Pink  noise  was  also  used  because  It  has  relatively  constant  energy 

per  octave-bandwidth.  The  24  ft-candle  light  level  compared  favorably  to  the 

Army  Corps  of  Engineers  standard  for  office  lighting.  The  other  chosen  levels  1 

of  12,  6 and  3 ft-candles,  respectively,  represented  successively  deteriorating 

ambient  light  conditions.  Throughout  the  testing,  the  brightness  of  the  visual 

display  was  constant. 

For  each  test  the  operator  was  required  to  type  his  name,  treatment 
combination,  and  date  as  part  of  the  message,  see  figure  2.  The  messages  for 
the  experiment  consisted  of  forty  random-letter  word  groups  of  five 
characters  each.  They  were  derived  through  a random  number  generator  and  an 
alphanumeric  conversion.  No  message  was  a duplicate  nor  were  they  duplicated 
by  any  of  the  operators  on  either  terminal  equipment.  The  random  letter 
format  was  used  so  that  the  operator  could  not  Identify  or  recognize  message 
words  and  therefore  would  have  to  concentrate  on  the  given  formats  to  avoid 
making  transcription  errors.  The  aim  of  the  experiment  was  to  vary  the 
environmental  variables  and  to  observe  the  accuracy  and  speed  of  transcribing 
the  random  letter  formats  as  a function  of  these  variables.  The  response 
variable,  accuracy,  was  the  measure  of  transcription  errors  that  each  operator 
committed  per  message  format.  The  errors  considered  were  the  following: 

1.  transposition 

2.  missing  letter 

3.  extra  letter 
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4.  Incorrect  space 

5.  extra  line  feed 

6.  missing  word  groups 

7.  wrong  letter 

8.  line  out  of  sequence  (skipped  line  Inserted  after  detection) 


19.  word  group  out  of  sequence 

The  results  were  compared  to  an  acceptable  operator  norm*  l.e. , typing  a 
message  format  on  a standard  teletype  terminal  (see  figure  3)  under 
the  same  conditions.  Each  operator  was  tested  In  four  sessions,  each  session 
\ programmed  for  eight  random  environmental  combinations,  four  for  each 
5 terminal  equipment,  where  tests  were  alternated  between  the  optical  display 
[ and  the  standard  teletypewriter.  This  was  done  to  reduce  the  effects  of  learning. 

I A thirty  minute  familiarization  period  was  given  each  operator  prior  to  the 

} tests,  and  a standard  Instruction  sheet  was  distributed  during  this  period 
\ to  Insure  uniform  orientation  with  the  equipment  and  with  the  purpose  and 
f procedure  of  the  experiment. 

[ The  effect  of  any  environmental  combination  Is  considered  to  be  the  sum 

j of  three  effects,  namely,  those  of  sound,  light,  and  the  Interaction  of 

j light  and  sound.  To  adequately  analyze  these  effects,  a two-level  factorial 

l experiment  was  formulated  with  six  replications.  The  four  levels  of  acoustic 
noise  are  combined  with  the  four  levels  of  ambient  light  giving  4 x 4 or  sixteen 
j treatment  combinations.  For  a two-factor  factorial  experiment  with  n 

observations  per  cell,  run  as  a completely  randomized  design,  [l]  , [2]  , a 
! general  model  1$: 

j Y1jk  " V ♦ Ai  + Bj  + A^Bj  + ek(1j) 


where  Y Is  the  response  variable,  l.e.,  the  number  of  transcribed  errors,  and 
A and  B are  the  main  effects  of  light  and  sound,  AB  Is  their  interaction,  e Is 
the  experimental  error,  (l.e.,  the  extent  to  which  the  observed  data  and  the 
general  model  disagree)  and  their  respective  levels  are  1 ■ 1 ,2,3,4;  j - 1.2. 3,4, 
with  k « 1,2— —6  observations  per  cell.  The  Interaction  term  adjusts  for  the 
failure  of  either  one  of  the  main  effects  to  remain  constant  for  each  level 
of  the  other.  The  test  runs  were  randomized  as  shown  In  table  I.  This  was 
done  to  minimize  the  effects  of  training. 
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Figure  3 - Teletypewriter  Terminal 


TABLE  1 

TREATMENT  SCHEDULE  PER  OPERATOR 

Environmental  Treatment 
Combination? 


Optical  Teletype 

Session  Run  Display  Terminal  Terminal 


I 

1 

1,4 

3,1 

2 

4.3 

4,4 

3 

3,2 

2.2 

4 

2,1 

1,3 

II 

5 

3.1 

4,1 

6 

4,4 

1,2 

7 

2.2 

3,4 

8 

1,3 

2.3 

III 

9 

4.1 

2.4 

10 

1,2 

3,3 

11 

3,4 

1.1 

12 

2,3 

4.2 

IV 

13 

2.4 

1,4 

14 

3.3 

4,3 

15 

1,1 

3,2 

16 

4,2 

2,1 

(Treatment  ■ (Ambient  Light  Level,  Acoustic  Noise  Level) 


Ambient  Light 
Level  Value 

1 24  ft -candles 

2 12  ft-candles 

3 6 ft-candl es 

4 3 ft-candles 


Acoustic  Noise 
Level  Value 


2 

3 

4 


5!TdBa 
70  dBa 
80  dBa 
95  dBa 
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ANALYSIS:  The  following  ANOVA  tables  and  statistical  estimates  were  formu- 
Tateato  analyze  the  transcribed  errors  for  the  standard  teletype  terminal  and 
for  the  optical  display  terminal  (tables  II,  III,  IV  and  V): 

TABLE  II 

ANOVA  FOR  STANDARD  TELETYPE  TERMINAL 

Degrees  of 


Sum  or  Squares 


Mean  Square  Error  "F"  ratio 


Ambient  Light,  A^ 

55.94 

3 

18.65 

0.33  j 

Acoustic  Noise,  Bj 

99.70 

3 

33.23 

0.59 

Interaction,  A^Bj 

109.93 

9 

12.21 

0.22 

Error-  emu) 

4494.67 

80 

56.18 

TABLE  III 

ANOVA  FOR  THE  OPTICAL  DISPLAY  TERMINAL 

Degrees  of 


Sum  of  Squares 


Mean  Square  Error  11 F11  ratio 


Ambient  Light 

65.28 

Acoustic  Noise 

276.03 

Interaction 

55.18 

Error 

5437.50 

RA- 


TABLE IV 


STATISTICAL  ESTIMATES  OF  TRANSCRIBED  ERRORS 
FOR  THE  TELETYPE  TERMINAL 


rm 


l 

Ambient 

Acoustic  Noise  Level 

— ForTTl 1 

l 

Liaht  Level 

Statistic 

SSdBa 

70  dBa 

mmim 

95  dBa 

Sound  Level s 1 

i- 

y 

24  ft-candles 

T 

r.  " i 

3.0 

5.8 

5.8 

6.2 

5.7  :] 

Sy 

1.87 

3.96 

3.7 

6.42 

4.23  1 

SY 

0.84 

! .77 

1.66 

2.87 

0.95  I 

■ 

12  ft-candles 

Y 

2.2 

6.8  ’ 

6.8 

9.8 

V 

6.4 

K 

sY 

2.17 

2.59 

5.54 

8.47 

6.63  !4 

' d6 

■V' 

s? 

0.97 

1.16 

2.48 

3.79; 

i- 

6 ft-candles 

7 

5.0 

3.8 

5.0 

7.2 

5.25 

> 

Sy 

3.94 

2.59 

6.2 

4.6 

4.34 

4 

1.76 

1.16 

2.77 

2.06 

0.97 

£ 

u 

3 ft-candles 

r 

4.4 

4.0 

3.8 

4.2 

4.10  1 

> 

Sy 

3.36 

4.95 

3.03 

1.79 

* 

4 

1.50 

2.21  ■ 

1.36 

0.80 

°-7’ 

i 

1 - - 7 ' 

* 

Y 

Overall  i 

For  All  Light 

4.15 

5.10 

5.35 

6.85 

5.36 

Level s 

SY 

3.30 

3.60 

4.55 

5.76 

4.43 

r 

s? 

0.74 

0.80 

1.02 

1.29 

0.50  ■ 

• 

l 

i 

■ 4 

! 

TABLE  V 


STATISTICAL  ESTIMATES  OF  TRANSCRIBED  ERRORS  FOR  THE 
VISUAL  DISPLAY  TERMINAL 


Ambient 
Light  Level 

Statistic 

Acoustic  Noise  Level- 
55  dBa  70  dBa  80  dBa 

95  dBa 

“ToFTIl — T 

Sound  Levels 

24  ft-candles 

T 

3.4  ■ 

5.80 

6.20 

9.2 

6.1b 

Sy 

in 

4.76 

5.17 

4.82 

4.61 

s- 

i 

1.21 

2.13 

2.31 

2.1 5 

1.03 

12  ft-candles 

V 

6.8 

5.0 

7.0 

8.60 

6.9 

su 

3.77 

2.77 

2.45 

6.35 

3.99 

s! 

Y 

1.69 

1.24 

1.10 

2,84 

0.89 

6 ft-candles 

7 

5.0 

5.2 

6.2 

5.8 

5.46 

S 

3.16 

2.39 

3.96 

4.16 

3.28 

4 

1.41 

1.07 

1.77 

1.86 

0.73 

3 ft-candles 

T 

6.0 

5.2 

5.4 

8.2 

6.2 

Sy 

3.67 

3.42, 

5,5 

4.71 

4.23 

s” 

1,64 

1.53 

2.46 

2.11 

0.94 

For  All  Light 

T 

5.3 

5.35 

6.20 

7.90 



6-19 

L.-vvel  s 

Sy 

3.34 

3.18 

4.11 

4.87 

4.00 

4 

0.75 

0.71 

0.92 

1,09 

0.45 

Although  one  might  expect  that  acoustic  noise  and  ambient  light  world 
strongly  affect  the  production  of  transcription  errors,  no  conclusive 
statistical  significance  as  to  environmental  effects  can  be  adjudged 
from  the  data.  Exemption  of  the  MSE,  however,  shows  that  acoustic  noise 
has  a stronger  effect  on  error  production  than  either  the  Ambient  Light  or 
the  Interaction  of  the  two  (see  tables  II  and  III).  Table  IV  and  V show, 
for  all  light  levels,  the  average  transcription  error  produ*  .Ion 
increased  by  about  60%.  For  all  sound  levels,  the  transcription 
error  did  not  vary  significantly. 

The  operators  chosen  were  all  of  the  same  minimum  proficiency,  each 
able  to  transcribe  messages  at  60  w.p.m.,  with  the  exception  of  one 
trainee.  Thus,  examining  the  variation  of  transection  errors  fur  the 
visual  display  terminal  at  70  dBa  (see  table  V)  for  light  levels  below 
?4-ft  candles,  the  mean  T and  standard  deviation,  Sy,  decrease  from  the 
55  dBa  values,  then  increase  as  noise  is  increased  'to  95  dBa. 


Interviews  with  the  subjects  seem  to  Indicate  that  70  dBa  1$  the  approxl- 
mate  level  of  noise  to  which  they  are  accustomed , and  therefore  they  were 
less  distracted  by  environmental  conges  In  ambient  light  at  this  sound 
level.  The  findings  Indicate  that  for  the  visual  display  terminal  under 
quiet  conditions  (l.e.,  at  55  dBa,  the  noise  below  standard  comcenter 
Operational  levels}  at  lower  levels  of  Ambient  Light,  more  errors  were 
made  than  at  normal  operating  (70dBa)  level.  The  effect  of  noise  at  the 
higher  levels  (80  and  95  dBa;  Indicates  the  variability  and  adaptability 
of  the  operators  to  acoustic  and  photic  noise.  It  was  also  noted  (as  was 
expected  with  the  visual  display  terminal)  that  changing  light  levels  had 
the  least  effect  on  operator  performance. 

Six  multiple  linear  and  non-linear  regression  models  were  fitted  to 
the  data,  by  the  least  squares  method,  to  characterize  operator  performance. 
The  models  were  of  the  form: 


(1)  Y * Bo  + 6 i X i + 82Xj  + e12 


(2)  Y ■ So  + BiXj  + M2  + X»X2  + eJ2 


(3)  Y - Bo  + BjXj  + B2X2  + B,X*  + B„X*  + BjXjX,  + e12 

(4)  Y - So  + BjXj  + 62X2  + B,  X*  + 8„X*  + 8SX*  + B,X*2  + B7X2Xa 

+ B»X*Xa  + B,XtX|  + eu 

(5)  Y * E B^xfxjj  + ^ o<j+ki<3 

(6)  Y - Bo  + BilnXj  + 82Xi  + 6 ln*X  + B„X*  + 65X21nX,  eJ2 

* « 


Where  Y Is  the  observed  operator  response,  Xt  and  X2  are  independent 
variables  corresponding  to  ambient  light  and  acoustic  noise  respectively. 
The  estimated  values  of  the  coefficients,  standard  errors  of  the  estimates, 
and  coefficients  of  determination  are  summarized  In  the  following  table: 
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Clearly,  the  higher  order  model  (4)  fits  the  data  best  on  the  basis 
of  Minimum  residual  variance,  S*  A , and  maximum  coefficient  of  determination, 
*‘rf  (V-Y) 

TM»  provides  the  model : 

Y • 6.788  + 1.752X,  + 0.655X»  ♦ 0.449Xa. 

1 


♦ 0.110XJ  + 0.225X*  - 0.543& 

+ 0.232XiXa  - 0.076X*Xa  - 0.108X,Xj 


Testing  for  fit,  the  sum  squared  error  due  to  regression  and  the  respective 
degrees  of  freedom  for  the  variation  of  Yd  from  the  curve  are  3.378  and 
{9,6}  respectively.  If  the  model  Is  correct,  the  residual  mean  square  has 
the  expected  value  of  oa.  Using  $a  *cra  » 0.5187  ■ MS  , the  "F"  ratio  Is: 

y (y-y)  e c 

F ■ MSc  - 3.378  ■ 3.907 

“R5“  '"'0V518' 

c 


and  Is  not  significant  since  3.907  < 5.520.  Thus,  on  the  basis  of  minimum 
Sa  ^ , maximum  Ra  A and  this  test,  we  have  no  reason  to  doubt  the  adequacy 

(y-y)  yy 

of  this  particular  model.  This  technique  Is  presented  to  show  the  feasibility 
of  using  multiple  least  squares  regression  for  this  type  of  man-machine 
Interface  problem.  A more  sophisticated  approach  Is  planned  at  a later  time 
when  more  data  Is  obtained. 

Conclusions:  Several  adverse  aspects  of  the  terminal  equipment  were 
discovered  which  may  affect  error  production.  The  angle  of  the  keyboard 
(see  figures  4 and  5)  of  the  visual  display  terminal  was  apparently  not 
conducive  to  optimum  performance.  The  teletypewriter  keyboard  was 
unanimously  considered  more  comfortable.  Also,  the  detent  pressure  of 
the  Individual  keys  and  the  absence  of  feedback  "thump"  seemed  to  Increase 
the  probability  of  transcription  error  with  the  visual  display  terminal. 

While  the  results  do  not  show  statistical  significance  of  the  environmental 
effects,  the  trends  In  the  statistics  (particularly  the  MSE  and  overall  means, 
•see  tables  II,  III,  IV  and  V)  Indicate  the  possibility  that  with  a larger 
population  of  more  homogeneous  (as  to  expertise)  subjects,  statistical 
significance  will  emerge.  That  Is,  the  variations  In  human  performance  will 
be  greater  under  abnormal  environmental  conditions.  If  such  abnormal 
conditions  are  to  be  expected  under  battlefield  conditions,  then  significant 

training  Information  could  be  extracted  from  such  a follow-on  experiment. 
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Another  measure  that  could  attain  statistical  significance  Is  the  mean 
transcription  error  production  for  the  group.  Such  statistics  Mould 
Indicate  the  outer  bounds  of  expectation  under  battlefield  conditions. 
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PLANNING  FOR  TOE  MEASUREMENT  OF  FLIGHT  TRAJECTORY 
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ABSTRACT.  This  paper  describes  a procedure  used  at  White  Sands 
Missile  Range,  New  Mexico  for  selecting  instruments  to  measure  a test 
object's  location  and  body  angles.  Criteria  far  selection  inolude 
number  and  location  of  instruments,  types  and  quality  of  measurements , 
probability  of  operation,  and  data  reduction  procedures.  Optimizations 
are  made  in  terms  of  cost-to-support , probability  of  success,  expected 
error  in  data  and  instrumentation  system  used.  Constraints  include 
expected  trajectory  and  object  dimensions,  optical  image  size  and  aspect 
angle,  tracking  rate,  atmospheric  distortion,  and  for  seme  applications, 
locations  of  existing  facilities. 

The  procedure  employs  both  theoretically  and  pragmatically  derived 
models  and  utilizes  observed  error  distribution  and  reliability  data. 

It  has  been  automated  far  computation  on  a UNIVAC  1108  computer. 

1.  INTRODUCTION.  The  purpose  of  this  report  is  to  outline  the 
mathematical  and  statistical  scheme  used  for  the  Resource  Conservation 
Planning  (RCP)  Model.  The  RCP  is  used  as  a tool  for  evaluating  and 
formulating  test  support  plans . 1 The  model  developed  is  formulated 
from  the  multi-station  solution  now  In  use  at  WSMR,  better  known  as  tho 
Davis  Solution.1  This  is  a least-squares  solution  which  is  identical 
to  the  maximum  likelihood  estimates  of  missile  position  In  the  particular 
case  in  whiah  the  instrumentation  measurements  are  normally  distributed. 

In  1965,  1LT  Charles  A.  Hall,  PhD,  expanded  the  least-squares  formulation 
to  provide  an  improved  estimate  and  to  minimize  the  number  of  observations 
required.  This  concept  became  known  as  Minimal  Station  Participation 
(MSPAR). 1 The  RCP  is  an  extension  of  this  concept.  The  scheme  has  been 


lJ.  V.  Carrillo  and  R.  L.  Garcia,  A Technique  for  Canputlng  The 
Probability  of  Meeting  a User's  Trajectory  Requirement,  (A  Tecmical 

Report  Mb. 151,  mstw;'vmr:  " 

*R.  C.  Davis,  Techniques  for  the  Statistical  Analysis  of  Cine theodolite 
Data,  (China  Lake , California , 18515 , page  I*  ~ 1 

*C.  A.  Hall,  Deleting  Observations  From  a Least-Squares  Solution. 
Proceeding  of  the  Eleventh  Conference  on  the  Design  Experiments  In“Srmy 
Research  Development  and  Testing,  ARD-D  Rpt  66-2,  (Durham,  NC,  1966). 
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adapted  to  cinetheodolites,  Telescopes,  Radar,  and  DOVAP  for  position 
and  attitude  applications . The  POP  model  uses  for  input  empirically 
developed  measurement  error  probability  tables  from  each  measurement 
system,  a proposed  flight  test  trajectory  of  a specified  test  objeot, 
and  the  uncertainty  (flight  test  rue- virement s)  in  the  flight  test  data 
that  a Range  User  oan  tolerate  in  r.iV  experiment.  The  probability  tablei 
are  used  to  compute  the  probability  of  a particular  data  error  for  a 
selected  or  given  gocmetry  configuration.  The  final  output  is  in  terms 
of  the  probability  of  meeting  a particular  Range  User  requirement. 

Hence,  oost-to-support  trade-offs  can  be  developed  based  on  the  risk 
a user  may  want  to  taka  in  completing  hie  experiment.  The  less  risk 
the  user  oan  aocept,  the  higher  the  support  cost. 


Restating  the  problem  as:  "Determine  the  probability  of  satisfying 
a Range  User's  requirement  far  a test  object's  position  and/or  attitude 
over  e given  interval , such  that  the  results  will  allow  cost  trade-off 
analyses." 


The  problem  statement  gives  rise  to  the  specific  questions  of  hew  to 
identify  the  minimum  set?  How  to  find  the  probability  of  suocees?'*an3 


. appro* 

question  (as  we  shall  see).  The  latter  two 
ere  the  substance  of  this  paper. 
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ESTIMATION  OF  THE  PROBABILITY  OF  SUCCESS.  Error  estimates  can 


probabilities.  Thus,  they  cen  be  combined  in  a probabilistic  f emulation. 
The  probabilities  Involved  in  the  estimation  of  meeting  a requirement  for 
one  point  of  a trajectory  can  be  expressed  in  equation  form  as: 


P(Rqjnt)^  = l CP(oe*  < Sx  ) x P(Sta  Opr)]^ 


(Eq  1) 


where, 


P(Rqmt)^  = Probability  of  meeting  the  requirement  at  the  1th  point 


Error  In  observed  data 


S 2 = Maximum  allowable  error  from  the  requirement 
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P(Sta  Opr)  ■ The  probability  of  auooeasfUl  station  oparation 


lP(R^>i 

P(Rqprt)  * ili * 

R 


(Eq  2) 


where 


R ■ the  lumbar  of  trajactxxy  points. 

Ths  only  unknown  parumater  in  Equation  1 ia  o0* . oe*  is  found  in  tha 
following  manner.  0 

The  baaio  regression  relationship  ia 


4 * B6 


where, 


♦ * Matrix  of  Observations 

B * Jacobian  Matrix 

8 > Matrix  of  Darivad  Trajectory  Data 


i 


M a £(*) 

0 

Where  c « 2,  3,  4,  ...»  x 

x * total  lumbar  of  sites  available 

\ 

Tha  probability  for  tha  entire  trajectory  ia,  tha  distribution  of  the 
chances  for  aucoess  at  all  pointa  frcm  the  population  of  oocurrenoes 
and  is  found  by  simply  averaging  tha  riak  over  all  pointa; 
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Solving  for  6 


i • ?> 

Jn 


6 a (B*WB)~l  B*W$ 
Oe*  » O^B1*©)"1 


(W  • Weight  Matrix) 


i 

or  | 

1 ' ■ i 

1 

‘ I , V . ■ . 

dga  a (b'H®)’1  for  W = [c^2]"1 

V 8 V<0tB)" ' (Eq  3)  I 

| 

" i 

This  last  equation  (Eq  3)  defines  the  data  error  in  terns  of  Geanetrio  ! 

Dilution  of  Precision  (GDOP)  and  measurement  error}  both  of.  which  are  j 

known  or  knowable.  For  a given  geometry,  (B^B)”1  is  deterministic  | 

while  o^a  is  probabilistic.  Thus,  the  probabilistic  nature  of  a ea 

is  dependent  on  the  probabilistic  nature  of  a^2 . 


In  actual  practice,  a requirement,  Sxa,  is  defined  as  the  trace  of 

a variance-covariance  matrix.  We  may,  therefore,  attack  the  heuristic 
nature  of  o^2  simply  by  introducing  a scalar  "s2". 


Equation  1 baccate 


P(Rqjnt>1  * l [p(S^  < sc^>  x p(sta  0pr)3i 


(Eq  4) 


The  formula  for  ocoputing  tht  probability  that  exactly  M of  N scheduled 
inatruqanta  oporatt  successfully  is: 


P(SUOpr)  r 


(Eq  5) 


V**r*!  Ri*  Rj*  *M  *”*  ****  liability  values  for  instrument  l,  2, 
3,  ...»  M.  Q|t  Qa,  ...,  Qjj  ere  the  (1-R^,  (1-R^,  ...»  (1-Rjj)  values 
for  each  of  the  instruments , respectively.  Note  that  there  are 


N! 

HKN-M5 ! 


terms  to  be  added  in  Eq  S. 

An  example  of  the  computational  procedure  for  a point  ia  shown  in 
Appendix  1. 

4-eg SBjJSS  M3DEL  0N1HE  COMPUTER.  A little  thought  on  the 
ocnputational  times  for  Equation  5 will  lead  one  to  the  realization  tlat 
the  time  will  approximately  double  for  each  additional  site  added.  This 
was  verified  for  the  program  prepared  far  the  UNIVAC  1108  computer:  A 5 


b- 


s. 

f 

fc 

k 

r- 
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station  solution  taking  2 seconds,  11  stations  taking  1 minute,  IS 
stations  taking  14  minutes , etc,  Alternatives  to  minimize  this  problem 
were  (1)  to  improve  the  speed  oi  each  computation  or  (2)  to  reduce  the 
number  of  candidate  sitae.  The  latter  course  was  pursued. 

An  initial  screening  was  derived  based  on  instruments  operating 
limitations. 

OPTICS  - Elevation  Angle  - Between  3°  und  80° 

Image  Size  - >35  Microns  (y)  for  Position 
>100  Microns  (y)  for  Attitude 
RADAR  £ DOVAP  - Elevation  Angle  - Between  10°  and  80° 

Next,  each  surviving  site  is  ordered  in  accordance  with  its 
contribution  to  the  error.  For  each  point,  an  error  constant4  Dj  is 
calculated  from:  3 


for  the  jth  site 


K is  an  index  of  observation  (<J>) 

L is  an  index  of  computed  values  (6) 
L = 1,  2,  3,  . • • , 6 


and 


Hj  = (BtWB)“lB;.tWj>s 

W.  is  a weight  matrix  from  o , 2W  = I 
J 9 

from  o02  * o^tB^)"1 

oe2  = (bH®)"2 


4c.f.,  Ref  1 
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Bj's  then  relate  to  Og2  from 


A 

1 D, 

a .3=1  3 


where, 


A = The  set  of  sites  used 

L = 3 for  Position  data 
2 for  Attitude  data 


The  Dj’s  vary  with  GDOP,  therefore,  the  largest  value  at  one  point  nay  be 

smaller  than  the  smallest  value  at  another  point.  Since  all  points  are 
assumedly  of  equal  importance  to  a customer,  the  ®0P  effect  (D^ ’s)  must 

be  normalized.  This  is  accomplished  by  the  following  scheme.  First,  an 
average  Dj  is  computed.  This  average  value  is  divided  into  each  value 

for  all  points.  Then,  each  site’s  normalized  point  value  is  summed  over 
all  points.  The  sites  are  then  ordered  (largest  to  smallest)  based  on  the 
magnitude  of  the  sum.  The  first  three  sites  (with  the  largest  values) 
are  then  selected  for  the  first  estimate  of  meeting  a user's  trajectory 
requirement.  If  the  probability  of  meeting  the  requirement  is  sufficient, 
the  computation  is  terminated.  If  the  probability  is  insufficient,  the 
site  with  the  next  largest  value  is  added  to  the  computation.  This 
procedure  is  continued  until  the  desired  probability  is  obtained  or  all 
the  sites  in  the  group  are  used.  This  procedure  has  resulted  in  minimizing 
the  number  of  sites  required. 

In  evaluating  the  procedure,  it  was  found  that  the  sites  selected 
produce  the  maximum  P( Rcynt ) 95%  of  the  time;  and  for  the  remaining  5%, 
the  P(Rqmt)  was  within  3%  of  the  rraximum. 
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! 6.  CONCLUSIONS . The  models  discussed  in  this  paper  can  be  used  for 

analyzing  cost-to-support  treble -offs.  Cost-to-support  is  related  directly 
to  the  type  and  amount  of  instrumentation  necessary  to  meet  a particular 
user  requirement.  Thus,  the  output  of  the  RCF  Model  provides  the  information 
necessary  for  risk  analysis  from  a measurement  aspect.  It  is  readily  apparent 
that  the  more  stringent  the  error  requirement  or  the  less  risk  of  data  loss 
i a user  can  accept,  the  higher  the  cost-to-support. 

(There  are  limiations  to  the  model.  First,  since  the  error  and 

reliability  values  used  are  baaed  on  history,  changing  performance  will 
result  in  erroneous  answers;  further,  since  the  present  reduction  process 
is  modeled  in  the  equations,  a change  in  the  procedure  will  necessitate 
revision  of  the  model. 
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NON -RANDOMIZED  CLINICAL  TRIALS 


E.  A.  Oehan  sad  E.  J.  Freireich 
The  University  of  Texas  Syetem  Cancer  Center 
Houston,  Texas 

1 

ABSTRACT 

This  paper  gives  a general  discussion  of  some  principles  Involved  in 
planning  comparative  studies,  namely,  the  objectives,  comparability  of 
patients,  feasibility,  and  ethics.  For  eaoh  principle,  circumstances  are 
given  for  which  a non-randomized  study  is  to  be  preferred  to  a randomized 
one.  Examples  of  non-randomized,  controlled  studies  are  presented  utilizing 
literature  controls,  an  acute  leukemia  late  intensification  study  involving 
matched  controls,  and  an  acute  leukemia  sequence  of  three  studies.  In  the 
latter  example,  adjustment  for  prognostic  factors  was  carried  out  to  enable 
the  studies  to  be  compared  with  respect  to  response  rate  and  survival. 


NON-KANDOMIZED  CLINICAL  TRIALS 


E.A.  Gehan 


E.J  Freireich 


The  University  of  Texas  System  Cancer  Center 


1 . Introduction 

Consider  the  design  of  the  following  Army  experiment  (hypothetical). 
Because  of  the  need  for  saving  money,  an  officer  in  the  Quartermaster  Corps 
does  a study  of  shoe  sizes  for  Army  recruits.  He  finds  that  the  distribution 
of  shoe  sizes  has  several  peaks  and  that  it  would  be  possible  to  save  money  in 
buying  shoes  by  ordering  only  a small  number  of  sizes.  He  decides  that  the 
best  way  to  determine  which  sizes  to  buy  is  from  a randomized  comparative  study. 
His  idea  is  to  issue  threo  sizes  of  shoes:  8H,  9*i  and  10s!  randomly  to  incoming 

recruits  and  theiT  "response"  to  a particular  shoe  will  be  measured  following 
a ten  mile  hike  by  interviewing  and  a physician's  examination.  The  ultimate 
objective  is  to  choose  a single  size  of  shoe  for  ail  recruits.  Mhat  is  wrong 
with  this  experiment?  The  objective  is  stated  clearly,  the  designed  experiment 
could  be  carried  out,  treatments  would  be  assigned  at  random  and  there  wouldn't 
be  much  difficulty  in  measuring  reaction  of  the  recruits  to  the  assigned  shoes. 
It  is  obvious  that  the  whole  experiment  is  ridiculous  because  each  individual 
has  hia  own  shoe  size  and  a choice  of  shoes  should  be  made  accordingly.  Random- 
ization, in  this  case,  added  only  a pseudo-scientific  aspect  to  the  experiment. 


i 
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The  outcome  could  be  predicted  well  and  a great  deal  of  suffering  would  be 
caused  among  the  Army  recruits  selected  frr  the  study  - either  by  randomization 
or  otherwise.  In  clinical  research,  treatment  must  often  be  tailored  to  the 
individual  patient  either  in  terms  of  dosage  or  schedule  and  a randomized  com- 
parative study  is  difficult  to  accomplish  when  treatment  is  individualized. 

Too  often,  randomized  comparative  clinical  trials  are  analogous  to  the  hypo- 
thetical quartermaster  who  proposed  a randomized  comparison  of  shoes  of  different 
iiz*s. 

1 . .1 

In  cancer  clinical  trials  and  in  other  disease  entities,  the  patient  is 
! in  a life  or  death  struggle  against  his  disease.  His  objective  is  to  win  the 
battle  and  he  clearly  would  like  to  be  In  the  hands  of  a physician  who  would  give 
him  the  best  chance  of  winning.  Would  the  best  chance  be  as  a patient  in  a ran- 

i , • 

doiized  comparative  study  or  a*  an  individual  receiving  care  from  an  outstanding 
' physician  who  used  his  best  knowledge  of  patient,  disease  and  treatment  to  choose 
a treatment  plan?  An  analogy  slight  be  the  selection  of  a designer  for  a car  to 
win  the  Indianapolis  S00  mile  race.  Would  a designer  be  chosen  who  did  a random- 
ized comparative  study  of  every  design  feature  to  be  added  to  the  car  or  would 
one  choose  an  experienced  designer  with  a good  record  and  ask  him  to  use  his  best 
judgment  to  design  a car  to  win  the  race.  Not  many  individuals  would  do  random- 
ized comparative  studies  in  an  attempt  to  win  the  Indianapolis  500;  why  then  the 
emphasis  on  randomized  comparative  studies  to  win  the  battle  against  cancer  or 
heart  disease? 

In  this  paper,  a discussion  will  be  given  to  the  general  considerations 
involved  in  planning  a randomized  vs.  non-randomized  comparative  study  and  some 
specific  examples  of  successful  non-randomized  studies  will  be  given.  These 
studies  involve  selection  of  control  patients  from  the  literature,  from  matched 
patients  and  from  the  previous  study  in  a sequence  of  clinical  studies.  Recent 
papers  stressing  the  value  of  non-randomized  studies  are  by  Gehan  and  Freireich 
(1974)  and  Freireich  and  Gehan  (1974). 
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2.  General  Considerations  j 

Pour  aspects  of  the  comparative  clinical  trial  will  be  considered.  These  j 
are:  (a)  objectives;  (b)  comparability  of  patients;  (c)  feasibility;  and  (d)  | 


ethics. 


(a)  Objectives 


Chalmers,  Block  and  Lee  (1972)  have  published  a paper  on  controlled  clin- 
ical trials  in  which  the  main  theme  is  illustrated  by  a humorous  conversation 
between  two  biostatisticians.  First  biostatistician,  “How's  your  wife?".  Second 
biostatistician,  "Compared  to  whom?".  The  humor  of  this  parable  emphasizes  two 
important  and  distinctive  facts  about  the  man's  wife:  the  first  being  how  does 
his  wife  differ  from  other  wives,  a comparative  fact;  the  second,  how  is  his 
wife  in  his  own  judgment,  that  is,  what  is  his  estimation  of  his  wife's  capabil- 
ities. This  fundamental  difference  is  frequently  overlooked  in  the  design  and 
conduct  of  a clinical  study.  It  should  be  emphasized  that  an  important  result  of 
a therapeutic  investigation  is  the  measurement  in  a quantitative  sense  of  the 
effectiveness  of  a given  treatment.  There  are  situations  in  which  the  important 
question  is  not  how  effective  is  this  treatment,  but  is  this  treatment  more  or 
less  effective  than  a standard  or  some  other  form  of  treatment.  In  general,  the 
latter  question  is  not  as  significant  as  the  former  - for  both  treatments  and 


wives. 


An  essential  ingredient  of  clinical  research  is  a significant  objective. 


Too  often  the  concept  of  randomization  is  equated  with  the  concept  of  research 
while  non- randomization  is  equated  with  "non-scientific"  or  "uncontrolled".  One 
cannot  replace  the  intelligent,  imaginative,  creative  work  of  a clinical  scientist 
with  the  routine  application  of  a clinical  trial  technique.  In  cancer  research, 
there  are  many  examples  of  non-randomized  studies  that  have  led  to  important  alter- 
ations in  methods  of  treating  patients.  Examples  are  the  discovery  of  mechlore- 
thamine  in  the  treatment  of  Hodgkin's  disease,  the  first  antimetabolite  methotrexai 
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in  the  treatment  of  patients  with  acute  leukemia,  vincristine  in  acute  leukemia, 
and  combination  chemotherapy  in  lymphoma  and  Hodgkin's  disease.  These  were  all 
dramatic  advances  in  the  treatment  of  patients  with  malignant  disease  and  this 
cnowledge  was  derived  from  non-randomized  clinical  studies.  What  new  and  effec- 
tive treatments  have  been  discovered  utilizing  randomized  clinical  studies? 

(b)  Comparability  of  patients 

As  A.B.  Hill  (1962)  has  put  it,  a sine  qua  non  in  the  proper  conduct  of 
} controlled  clinical  trial  is  having  comparable  groups  of  patients.  A clinical 

i 

i 

.rial  designed  to  evaluate  the  relative  effectiveness  of  two  or  more  treatments 

1 

jihould  be  planned  so  that  the  only  differences  among  treatment  groups  are  in  the 
|ctual  treatment  received.  This  requires  comparability  of  patients  as  they  are 
jntered  into  study,  managed  when  on  study,  and  analyzed  when  the  study  is  completed. 

The  entry  of  patients  will  be  discussed  here  and  one  technique  for  achiev- 
ng  comparability  of  patients  is  randomization,  possibly  stratified  so  that  there 

! 

ire  separate  randomizations  of  patients  in  prognostic  categories.  Even  the  pro- 

jonents  of  randomization  agree  that  randomization  guarantees  comparability  of 

j 

jatients  on  tho  average  and  this  needs  to  be  checked  in  every  clinical  trial.  It 
jay  even  be  argued  that  randomization  is  a guarantee  of  non-comparability  of  treat- 
ment groups  with  respect  to  some  patient  characteristics,  if  enough  patient  char- 
acteristics are  examined.  For  example,  if  there  were  a 5%  chance  that  the  random 

i 

[signment  of  patients  would  lead  to  a significant  difference  between  treatment 
oups  with  respect  to  a given  patient  characteristic  and  the  distribution  of  20 
^laracteristics  were  considered,  it  would  be  expected  that  there  would  be  a sig- 


jlficant  imbalance  between  groups  with  respect  to  at  least  one  characteristic, 
js  Daniel  (1970)  has  pointed  out,  "Randomization  is  a confession  of  ignorance, 
jill  randomization  is  a confession  of  full  ignorance."  In  other  words,  a full 
^ndomization  should  be  accomplished  only  when  a clinical  investigator  is  not 
tnizant  of  any  patient  characteristics  that  influence  prognosis. 
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Another  technique  for  achieving  comparability  of  patients  at  time  of 
entry  into  study  is  to  select  patients  for  a control  group  according  to  certain 
characteristics,  namely  those  which  are  known  to  influence  prognosis.  If  treat- 
ment A is  the  treatment  under  study  and  treatment  B is  a standard  or  "control" 
treatment  which  is  to  be  compared  with  A,  the  control  group  of  B patients  could 
be  selected  from  the  literature,  chosen  on  a matched  basis  from  previously  or 
concurrently  conducted  clinical  studies,  or  selected  from  the  previous  study  in 
a soquence.  The  primary  assumption  needed  for  selecting  a control  group  is  that 
the  important  patient  characteristics  related  to  prognosis  are  known,  so  that 
there  is  a firm  basis  for  selecting  a comparable  group  of  patients.  Further,  it 
must  be  assumed  that  differences  which  do  exist  between  the  groups  selected  (such 
as  time,  institution,  physician,  or  the  availability  of  supportive  care)  have  little 
or  no  relation  to  the  outcome  of  the  treatment.  In  a disease  which  has  been 
studied  extensively,  techniques  of  regression  analysis  can  be  used  to  determine 
patient  characteristics  related  to  prognosis.  See  Armitage  and  Gehan  (1974)  for 
a review  of  available  methods.  Some  examples  will  be  ’* -cussed  in  section  3. 

(c)  Feasibility 

In  general,  the  feasibility  of  a particular  study  relates  to  the  number 
of  patient3  required  *ud  its  duration.  For  a particular  investigator  or  group 
of  clinical  investigators,  one  can  compare  the  strategy  of  proceeding  from  one 
fairly  large  study  to  the  next,  each  based  on  a single  treatment  vs.  the  strategy 
of  randomizing  between  two  treatments  in  each  study.  Suppose  the  investigators 
in  both  circumstances  has  exactly  the  same  requirements  concerning  the  number  of 
patients  to  be  studied  on  each  treatment.  Suppose  the  number  required  for  each 
treatment  is  N and  the  group  of  investigators  accrues  this  number  of  patients  in 
one  year.  Assuming  that  no  follow-up  period  is  required  for  observing  the  effect 
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of  treatment,  the  strategy  of  proceeding  sequentially  from  one  study  to  the 
next  means  that  one  year  will  be  required  for  each  study.  The  investigator  who 
always  randomizes  between  two  treatments  requires  two  years  to  complete  each 
study.  It  is  true  that  at  the  end  of  two  years,  an  investigator  following  either 
strategy  will  have  evaluated  two  treatments,  however  the  investigator  who  does 
sequential  studies  will  have  an  opportunity  to  choose  a second  treatment  based 
upon  the  results  of  the  fiTst.  Further,  some  investigators  adopt  the  practice 
of  always  carrying  along  the  best  treatment  from  a previous  study  in  the  current 
study;  this  results  in  evaluating  three  treatments  every  four  years  compared 
with  four  treatments  for  the  investigator  who  proceeds  sequentially.  The  latter 
investigator  will  have  had  the  opportunity  to  build  upon  knowledge  gained  from 
previous  studies  to  choose  three  treatments,  while  the  investigator  preferring 
simultaneous  comparisons  will  have  chosen  only  one  new  treatment  based  upon  the 
results  of  a previous  study. 

Suppose  an  investigator  is  doing  a simultaneous  comparison  of  treatments 
A and  B in  which  a fixed  number  of  patients  is  to  receive  each  treatment  so  that 
the  difference  in  response  rates  can  be  detected  at  a given  significance  level  and 
power  of  test.  These  specifications  lead  to  n patients  being  required  on  each 
treatment  and  tables  of  n are  readily  available  in  textbooks  (Cochran  and  Cox, 
1957}  (Holland  and  Frei,  1973).  An  experimenter  who  does  studies  in  sequoncc  of 
one  treatment  might  be  prepared  to  assume  that  the  response  rate  to  the  control 
treatment  (B)  is  so  well  known  that  it  may  be  taken  as  a fixed  quantity,  say  p, 
and  no  patients  need  receive  B in  the  trial.  To  carry  out  a statistical  test  of 
the  difference  between  the  proportion  of  patients  responding  to  A and  B at  the 
same  significance  level  and  power  assumed  above,  only  n/2  patients  are  needed  on 
treatment  A,  which  is  only  1/4  the  total  number  of  patients  required  for  the  ran- 
domized comparative  trial.  When  the  cost  of  supporting  clinical  studies  is  often 
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in  excess  of  11,00b, 000  per  year,  a savings  of  patients  and  duration  of  study 
has  a substantial  dollar  equivalent.  Even  when  the  response  rate  to  the  con- 
trol treatment  is  not  known  precisely,  it  aay  still  be  reasonable  to  proceed 
as  if  it  is  known.  For  example^  in  the  treataent  of  patients  with  advanced  lung 
cancer,  the  expected  percentage  of  patients  responding  to  standard  treatment  is 
very  low  (less  than  20%)  and  survival  is  poor.  In  this  c ire  instance,  it  would 
be  sensible  to  test  a proposed  therapy  against  a specified  percentage,  say  20%. 

The  objective  would  be  to  find  a new  treataent  that  has  a response  percentage 
significantly  higher  than  20%. 

(d)  Ethics 

All  clinical  investigators  seek  results  which  demonstrate  that  the  overall 
prognosis  for  patients  is  getting  better.  Clinical  trials  in  which  patients  do 
less  well  than  they  have  in  the  past  are  to  be  avoided  at  all  posts  and  to  be  con- 
cluded as  early  as  possible.  A comparative  clinical  trial  should  not  be  started 
unless  there  is  some  preliminary  evidence  suggesting  that  the  new  therapy  is  at 
least  as  good  and  possibly  better  than  the  standard.  If  this  is  accepted,  the 
question  can  be  raised  whether  it  is  ethical  to  enter  patients  on  the  standard 
therapy  when  there  is  little  or  no  chance  that  the  standard  could  be  better  than 
the  new  therapy.  That  is,  the  objective  should  be  to  study  the  new  therapy  until 
it  can  be  concluded  whether  the  new  therapy  is  significantly  more  effective  than 
the  standard  or  not.  Study  of  the  new  therapy  could  be  stopped  when  the  probability 
of  its  being  more  effective  than  the  standard  becomes  very  low.  < 

The  clinical  investigator  conducting  studies  in  sequence  of  treatments  is  - "l 

always  giving  what  he  considers  to  be  the  best  treatment  to  his  patients.  Re- 

. ■* 

cruitment  of  patients  to  a clinic  to  receive  this  treataent  is  much  easier  than  * 

for  the  investigator  who  proceeds  by  simultaneous  comparisons.  The  former  inves- 
tigator can  promise  all  patients,  even  those  who  come  froa  long  distances,  that 
they  will  receive  what  the  investigator  thinks  is  the  current  best  treatment.  The 
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latter  type  of  Investigator  can  promise  only  that  the  choice  of  treatment  will  be 
determined  essentially  by  flipping  a com  and  that  the  treatments  in  the  clinical 
trial  are  reasonably  good  ones. 

Neie^r  (1975)  has  stated  the  ethical  problem  as  follows:  'The  view  is 

often  expressed  that  each  patient  must  be  afforded  the  prosumed  benefit  of  any 

, * ' ' 

estimated  advantage  of  one  treatment  over  another,  regardless  of  how  slight  or 
uncertain  that  advantage , may  be.  I insist  that  this  view  does  not  reflect  ay 

1 i 

attitude  Shout  myself  as  a patient,  nor  dots  it  reflect  the  attitude  of  most  of 
us.  Make  no  mistake  about  it,  this  position  is  incompatible  with  any  experimenting 
whatever,  controlled  or  casual.  It  does  not  favor  judicious  experimenting  with  a 
new  technique  or  drug  on  carefully  selected  patients.  That,  after  all,  can  be  done 
in  a controlled  study.  Rather,  it  forbids  any  experimenting  at  all."  The  ethical 
dllomma  disappears  if  one  proceeds  sequentially  In  evaluating  treatments  - the 
presumed  best  trestmont  is  always  being  given.  However,  what  Meier  and, many  other 

■ • ' o'  , 

statisticians  do  not  accept  it  that  conducting  itudies  in  sequence  dan  resolve 
the  scientific  problem  of  properly  evaluating  the  relative  effectiveness  of  treat- 
ments. This  will  be  demonstrated  by  some  examples  from  cancer  clinical  trials. 


i/. , 


3.  Examples  of  Non -Random! zed  Clinical  Trials 

In  this  section,  some  examples  of  non-randomized  clinical  trials  are  given 
in  which  patients  in  the  control  group  were  selected  to  be  comparable  to  those 
receiving  a study  treatment.  Patients  in  the  control  group  were  selected  based 
upon  their  prognostic  characteristics  and  the  assumption  was  made  in  all  studies 
that  the  patient  characteristics  chosen  accounted  for  the  major  proportion  of  the 
pat lent- to -pat lent  variability  in  response.  Literature  controls,  matched  controls, 
and  patients  from  a sequonce  of  studies  will  be  considered  in  relation  to  the  eval- 
uation of  study  treatments. 


\ 

si 
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(a)  Literature  Controls 

In  all  circumstances  in  which  the  same  or  similar  treatments  have  been 
used  by  others  in  a clinical  investigation,  it  is  desirable  to  use  these  patients 
as  controls,  even  when  there  is  also  an  internal  group  of  control  patients  in  the 
trial.  Unfortunately,  it  is  usually  true  that  authors  do  not  provide  sufficient 
data  in  their  papers  so  that  it  can  be  checked  whether  the  patients  reported  in 

the  literature  are  comparable  to  those  in  a given  clinical  trial.  It  certainly 

1 ) 

would  be  helpful  if  authors  and  those  engaged  in  large  cooperative  group  studies 
cotild  make  available  basic  data  on  punch  cards  or  computer  tape  so  that  others 
might  use  the  data  for  literature  controls. 

An  example  of  a literature  control  group  is  given  in  the  study  reported 
by  Luce  et  al  (1971)  in  which  combined  cyclophosphamide,  vincristine  (Oncovin), 
and  prednisone  therapy  (COP)  for  malignant  lymphoma  was  compared  to  single  agent 
treatment  with  cyclophosphamide  or  a vinca  alkaloid  (vinblastine  for  Hodgkin's 
disease  and  vincristine  for  lymphosarcoma)  as  reported  by  Carbone,  Spurr,  et  al 
(1968).  All  patients  in  both  studies  had  stage  III  or  IV  disease.  However,  patients 
who  had  received  major  prior  chemotherapy  or  those  with  moderately  impaired  bone 
marrow  reserve  were  excluded  from  the  Carbone  study.  Thus,  in  terras  of  prior 
treatment  and  bone  marrow  reserve  - two  important  prognostic  factors  - patients  who 
had  received  little  or  no  prior  treatment  in  the  Luce  study  were  comparable  to  those 
in  the  Carbone  study.  The  age  and  sex  distributions  were  similar  in  the  studies. 
Hence,  when  adjustment  was  made  for  prior  therapy,  it  could  be  concluded  that 
patients  in  the  Carbone  study  were  comparable  to  those  in  the  Luce  study.  The 
complete  remission  rate  following  COP  treatment  was  36-50%  in  malignant  lymphoma 
compared  with  6-20%  for  the  single  agent  treatment  reported  by  Carbone.  In  addition, 


other  series  of  patients  receiving  either  single  agents  or  COP  treatment  by  a 
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slightly  different  schedule  had  similar  results.  Because  both  single  agents  and 
COF  had  response  rates  that  were  consistent  from  one  study  to  the  next  and  the 
evidence  that  COP  was  significantly  superior,  it  seemed  safe  to  conclude  that  COP 
was  superior  to  single  agent  treatment  in  the  induction  of  complete  remissions. 

Another  example  is  that  given  by  Sut6w  et  al  (1970)  in  which  the  survival 
experience  of  patients  with  Wilm's  tumor  or  neuroblastoma,  first  treated  in  1962, 
was  compared  to  that  of  pdtients  first  treated  in  1956.  A total  of  3$  institutions 
participated  in  the  study  and,  for  patients  with  Wilm's  tumor,  it  was  demonstrated 
that  the  age  distribution,  percentage  of  children  with  metastases,  and  Intensity 
of  surgical  and  radiation  therapy  were  comparable  between  the  two  time  periods. 
However,  94%  of  patients  received  drug  therapy  (mainly  actinomycin-D,  vincristine, 
and  cyclophosphamide)  in  1962  compared  with  28%  in  1956.  A significant  improval 
in  survival  was  demonstrated  for  patients  of  all  ages  without  meatstases  and  for 
patients  two  years  or  older  with  motastases.  The  outhors  concluded  that  the  in- 
creased clinical  use  of  chemotherapeutic  agents  resulted  in  the  significant  improve- 
ment in  the  survival  curves.  For  patients  with  neuroblastoma,  though  there  was 
a slight  difference  in  the  survival  experience  for  both  non-metastatic  and  meta- 
static patients  favoring  those  first  treated  in  1962,  the  difference  was  not  near 
statistical  significance  and  it  was  concluded  that  the  increased  use  of  chemo- 
therapeutic agents  did  not  result  in  a significant  improvement  in  survival  time. 

A literature  control  group  is  useful  when  patients  can  be  checked  for 
comparability  and,  in  some  circumstances,  when  it  can  be  demonstrated  that  patients 
in  the  literature  have  more  favorable  prognostic  indicators.  Authors  should  be 
encouraged  to  have  details  of  their  data  available  to  others  for  comparison  purposes. 

(b)  Matched  Controls 

In  a matched  control  study  in  which  patients  are  to  be  selected  from  a 
group  of  patients  treated  in  the  past,  all  new  patients  would  receive  the  treatment 


51 


I 


s 

f 

i 


to  be  evaluated,  say  treatment  A.  A pairmate  for  each  patient  receiving  A would 
be  chosen  at  random  from  among  tho  possible  pairmates  in  the  group  of  historical 
control  patients  who  received  treatment  B.  The  applicability  of  this  approach 
depends  upon  having  a sufficiently  large  group  of  patients  for  potential  pairmates. 
Patients  obtained  by  this  process  who  receive  treatment  A would  be  as  comparable 
as  possible  to  those  on  treatment  B with  respect  to  the  patient  characteristics 
used  as  a basis  for  the  pairing.  If  sufficient  patients  are  available,  it  may  be 
desirable  to  select  two  control  patients  for  each  treated  patient,  making  a com- 
parison between  control  patients  to  test  the  selection  process. 

An  example  of  this  type  of  study  is  given  by  Bodey  et  al  (1976}  who  com- 
pared the  length  of  complete  remission  for  patients  with  acute  leukemia  between 
two  groups:  a study  group  receiving  late  intensification  chemotherapy  and  immuno- 
therapy a median  of  89  weeks  (range  of  58  to  194  weeks)  after  achievement  of  com- 
plete remission  vs.  a matched  control  group  of  patients  who  received  maintenance 
therapy  at  monthly  intervals,  generally  the  same  therapy  that  induced  the  remis- 
sion. The  objective  of  the  late  intensification  study  was  to  cure  the  patient  by 
administering  an  intense  program  of  therapy  with  new  agents  when  the  leukemia  cell 
population  was  at  a minimum.  Patients  were  matched  by  age  group,  cell  type,  and 
length  of  remission  prior  to  the  start  of  late  intensification  therapy.  There 
were  17  patients  in  the  matched  control  gToup  and  19  in  the  group  receiving  late 
intensification  therapy  (matched  controls  could  not  be  found  for  two  patients) . 

The  median  duration  of  complete  remission  subsequent  to  late  intensification  ther- 
apy has  not  yet  been  reached  but  will  be  in  excess  of  98  weeks,  only  5 patients 
relapsing  of  19.  The  median  length  of  subsequent  remission  in  the  matched  control 
group  was  24  weeks  and  there  is  a highly  significant  statistical  difference  be- 
tween the  two  remission  curves  (P<.01).  Comparing  survival  times  between  groups, 


52 


j 


4 

1 

I 


*3 


£|gg| 


16  of  the  19  patients  receiving  late  intensification  treatment  are  still  alive 
and  their  median  follow-up  time  is  97  weeks.  The  median  survival  time  for  patients 
in  the  matched  control  group  is  56  weeks  and  the  difference  between  curves  was 
highly  statistically  significant  (P<.01).  Thus,  this  study  has  demonstrated  the 
importance  of  a new  concept  in  the  treatment  of  patients  with  acute  leukemia  that 
may  have  resulted  in  a cure  of  some  patient*. 

Another  study  by  Bodey  et  al  (1971)  in  patients  with  acute  leukemia  demon- 
strated  that  patients  in  a protected  environment  (PE)  receiving  prophylactic  anti- 
biotics and  chemotherapy  had  significantly  better  length  of  complete  remission 
(median  of  55  weeks  for  PE,  26  weeks  for  controls),  length  of  survival  (median  of 
34  weeks  for  PE,  23  weeks  for  controls),  and  percentage  of  days  spent  with  infec- 
tion as  related  to  neutrophil  count  than  a matched  control  group  of  patients 
treated  outside  a protected  environment. 

(c)  Controls  Soloctcd  from  a Sequence  of  Studios 

There  are  many  cooperative  groups  engaged  in  cancer  research  in  the  USA 
who  proceed  from  one  study  to  the  next.  Generally,  there  is  little  change  over 
short  intervals  of  time  in  institution,  type  of  patient,  criteria  for  diagnosis 
and  response,  and  availability  of  supportive  therapy.  In  this  circumstance,  it 
is  sensible  to  compare  results  from  a previous  study  with  those  of  a current  one. 
Using  patients  from  a previous  study  as  controls  might  be  misleading  if  a rela- 
tively long  time  interval  had  elapsed  between  studies  (say  greater  than  3 years) 
or  if  it.  could  bo  demonstrated  that  important  changes  had  taken  place  with  respect 
to  clinical  investigators,  type  of  patient,  criteria  for  evaluation,  etc.  There 
are  about  25  cooperative  groups  in  the  United  States  supported  by  the  National 
Cancer  Institute  that  proceed  directly  from  one  study  to  the  next,  have  a stable 
group  of  clinical  investigators,  see  the  same  types  of  patients  from  year  to  year, 
have  the  same  access  to  supportive  therapy  measures  and  generally  use  the  same 


criteria  of  response  in  successive  studies.  Using  patients  from  a previous  study 
as  controls  would  often  be  feasible  for  such  cooperative  groups. 

Examples  from  studies  conducted  by  the  Southwest  Oncology  Group  demonstrate 
that  the  same  treatment  administered  in  successive  studies  may  be  expected  to  lead 
to  the  same  general  result.  In  consecutive  studies  of  previously  untreated  pedia- 
tric patients  with  acute  leukemia,  the  complete  bone  marrow  remission  rates  for 
patients  treated  with  vincristine  plus  prednisone  were  83%  (72/87)  in  the  Al.inC  #6 
study  and  86%  (237/276)  in  the  ALinC  »7  study  (Lonsdale  et  al,  1975).  In  consecu- 
tive studies  of  patients  with  Hodgkin's  disease,  the  complete  remission  rate  fol- 
lowing MOPP  treatment  has  remained  very  close  to  80%  for  previously  untreated 
patients  with  stage  III  or  IV  disease. 

When  consocutive  studies  of  different  treatments  have  been  conducted,  re- 
gression models  can  be  utilized  to  test  whether  there  are  significant  treatment 
differences,  adjusting  for  values  of  the  prognostic  characteristics  in  the  succes- 
sive studies.  If  response  is  the  end  point  for  analysis,  stepwise  logistic  re- 
gression procedures  can  be  carried  out  to  interpret  the  data  (Cox(1970),  Lee  (1974)) 
If  survival  or  length  of  response  is  the  end  point,  Cox's  regression  model  (Cox  (197 
may  be  used.  An  example  will  be  given  from  successive  studies  conducted  in  the 
Southwest  Oncology  Group. 

Over  the  past  several  years,  the  Southwest  Oncology  Group  (SWOG)  has  con- 
ducted the  following  clinical  studies  in  patients  with  adult  acute  leukemia:  COAP 
vs.  GAP  vs.  DOAP  (from  2/71  to  10/72);  a 10-day  OAP  study  (from  6/73  to  1/75); 
and  a C1AL  study  (from  1/75  to  present).  The  designations  of  the  drugs  are  as 
follows:  C«Cyclophosphamide,  0«Vincristine  (Oncovin),  A**Cytosine  Arabinoside,  am* 

P=Prednisone.  The  CIAL  study  in  the  remission  induction  phase  consisted  of  givir 
vincristine  plus  prednisone  to  all  patients  with  less  than  30,000  blasts  in  the 


! 

'peripheral  blood.  For  patients  with  30,000  or  more  blasts,  patients  were  random- 

iizcd  between  sequential  vs.  simultaneous  adriamycin-OAP  treatment.  In  the  first  J 

study,  OAP  was  given  by  continuous  infusion  over  a period  of  five  days. 

I 

t -- 

| The  complete  remission  rate  for  5-day  OAP  w*s  43%  (39/90),  that  for  10-  j 

| i: 

i day  OAP  was  53%  (92/173),  and  the  current  complete  response  rate  for  patients  in  j 

| 

the  combined  groups  on  CIAL  is  60%  (70/117).  The  question  arises,  do  these  data 

[ T 

.‘indicate  significantly  improved  complete  remission  rates  by  study,  or  is  there 
evidence  that  the  types  of  patients  on  the  three  studies  might  explain  the  dif-  | 

I ' V 

•ferences  in  complete  remission  rates? 

i 

Prom  previous  studies  in  adult  acute  leukemia,  the  following  patient  char- 
acteristics have  been  identified  as  being  predictive  of  response:  age  (years), 

-infection  status  at  start  of  study  (0>no,  l»yes),  acute  myelocytic  leukemia  (0»no, 

\ ] 

;l-yes),  hemoglobin  value  (gm3  %) , and  logarithm  (white  blood  count).  These  five  i 

i ! 

’patient  characteristics  and  two  variables  representing  the  linear  and  quadratic 

£ j 

jeffect  of  treatments  were  included  in  a logistic  regression  equation.  The  regres- 

r 

\ 

sion  equation  obtained  is  as  follows: 

J » ♦ .1276-  . 0417(Age-44. 73)  + .5027(Treat. linear-. 101) 

- .7000(Infection  status-. 388)  - .3806 (AML- .830) 

♦ .0S01(Memoglobin-9. 21)  - .0597(log(WBC)-4.144) 

♦ .0207(Treat. quadratic*. 407) 
jwhere  is  thv  predicted  complete  remission  rate  based  on  the  7 patient  characteristics. 

i’ 

The  coefficients  in  the  equation  were  determined  by  stepwise  logistic 
^regression  (Lee(1974))  so  the, significance  level  of  each  entering  characteristic 
can  be  calculated.  The  statistical  significance  level  of  each  entering  variable 
was:  age  (P<.01),  treatment  linear  (P<.01),  infection  status  (P<.01),  AML  (P-.18), 

Jhcmoglobin  (P=.33),  log  WBC  (P».76)  and  treatment  quadratic  (P«.80).  This  analysis 
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demonstrates  that  there  is  statistically  significant  evidence  of  a linear  increas- 
ing trend  in  response  rate  by  study  and  that  age  and  infection  status  are  signi- 
ficantly related  to  response  rate. 

Evidence  that  the  five  patient  characteristic*,  do  predict  complete  remis- 
sion rate  is  given  in  Table  1.  A logistic  regression  equation  was  fit  to  the  five 
patient  characteristics  in  the  S and  10-day  OAP  studies  (excluding  treatment  as  a 
possible  characteristic).  This  equation  is  as  follows: 

log|l~-j  - .02888  - .04238(Age-. 44031) 

- . 59297(Infection  status-. 37)  - .3S854(AML- .872) 

- .01431 (Hemoglobin-9. 155)  - .0208(iog(WBC)-4. 127) . 

Table  1 gives  the  observed  and  predicted  numbers  of  patients  responding 

on  the  10-day  OAP  and  CIAl  studies.  As  would  be  expected,  the  relationship 
between  observed  and  predicted  probability  of  response  was  excellent  for  the  10- 
day  OAP,  since  the  equation  is  being  re-applied  to  the  some  data  from  which  it 
was  derived.  Note  that  there  is  also  a good  relationship  between  observed  and 
predicted  probability  of  response  for  patients  on  the  CIAL  study.  The  observed 
percentages  responding  were  higher  than  predicted  in  patients  with  predicted  pro- 
babilities under  .60  and  were  in  accord  with  predictions  for  patients  with  pre- 
dicted probabilities  over  .60.  Hence,  there  is  some  evidence  that  patients  on 
the  CIAL  study  produced  higher  observed  responses  in  patients  with  relatively 
low  predicted  probabilities  of  response.  When  the  equation  was  applied  to  the 
patients  from  5-day  OAP,  the  predicted  complete  remission  rate  was  52.1%;  it 
was  50.0%  for  patients  on  10-day  OAP,  and  50.8%  for  CIAL.  Hence,  there  was 
strong  evidence  that  patients  on  all  three  studies  were  comparable  with  respect 
to  the  five  patient  characteristics. 
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Cox's  regression  model  was  fit  to  the  survival  data  from  the  three 
studies  using  the  same  five  patient  characteristics  and  treatment  variables  as 
in  the  analysis  of  response,  Cox's  model  may  be  written  as  follows: 

X(t)  - exp  S2(x2-x2)*  ...  ♦ Bp(xp-Xp)}  VO 

where  X(t)  is  the  hazard  function  at  time  t,  the  B's  are  regression  coefficients, 
the  x's  are  patient  characteristics  potentially  related  to  survival,  the  7's  are 
average  values,  and  AQ(t)  is  an  arbitrary  hazard  function  when  all  the  x's  are 
4t  their  mean  values.  The  model  fit  to  the  survival  data  from  the  three  studies 
is  as  follows: 

* ♦ ,0319(Age-44 . 74)  - .4269(Treat. linear- . 10) 

+ .4978(Infection  status*. 39)  + . 1435(log(WBC)-4. 14) 

- .0429(Treat. quadratic*. 41)  - .0097 (Hemoglobin-9. 21) 
♦ ,0006(AML-.83). 

The  model  was  fit  in  forward  stepwise  fashion  and  the  statistical  sig- 
nificance of  adding  variables  at  each  step  was  as  follows:  age  (P<.01),  treatment 

linear  (P=.001),  infection  status  (P*.001),  log  (WBC)  (P-.30),  treatment  quadra- 
tic (P«.39),  hemoglobin  value  (P*,77)  and  AML  (P».99).  Hence,  as  in  the  analysis 
of  response,  age  and  infection  status  are  the  two  characteristics  most  signifi- 
cantly related  to  survival  time  and  there  is  evidence  of  a linear  trend  which 
indicates  increasing  survival  time  by  study.  Figure  1 gives  the  survival  curves 
for  patients  on  the  three  studies.  The  median  survival  time  for  patients  receiv- 
ing 5-day  OAP  was  7 weeks,  that  for  patients  receiving  10-day  OAP  was  38  weeks, 
and  the  median  has  not  yet  been  reached  for  patients  on  the  CIAL  study.  There  is 
evidence  of  a significant  advantage  in  survival  for  10-day  vs.  5-day  OAP  patients 
(P<.015)  and  nearly  significant  evidence  that  CIAL  has  superior  survival  to  10- 
day  OAP  (P* . 059) . 
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These  regression  analyses  have  permitted  comparison  to  be  made  among 
treatment  programs,  adjusting  for  patient  characteristics  related  to  prognosis. 
Based  upon  those  analyses,  one  could  moTe  confidently  assert  that  there  were 
real  differences  in  response  rate  and  survival  among  the  three  studies  because 
patient  characteristics  were  adjusted  for  in  both  analyses,  patients  were  com- 
parable in  the  three  studies  with  respect  to  predicted  probability  of  complete 
remission,  and  the  same  patient  characteristics  (namely,  age  and  infection 
status)  were  significantly  related  to  response  and  survival. 


4.  Discussion 

The  point  of  view  has  been  presented  that  rational,  scientific,  and 
controlled  clinical  studies  can  be  accomplished  without  randomization.  In  some 
circumstances,  patients  that  are  comparable  in  prognosis  can  be  identified  in 
successive  studies  which  allow  comparison  between  a group  of  patients  under  inves- 
tigation and  other  groups  treated  in  the  past.  Recording  data  which  differs 
significantly  from  that  observed  in  the  past  form%  the  basis  for  new  knowledge. 
Confirmation  of  data  by  the  same  investigator  and  by  other  investigators  in  other 
institutions  provides  a convincing  mechanism  for  generating  knowledge  which  pre- 
dicts for  the  future. 

The  major  reasons  for  preferring  the  non- random! zed  to  the  randomized 
study  are:  a clinical  investigator  in  a non- randomized  3tudy  is  always  adminis- 

tering what  he  believes  to  be  the  best  treatment  for  the  disease  under  investi- 
gation so  there  is  no  ethical  dilemma,  and  non-randomized  studies  require  fewer 
patients  and  proceed  more  quickly  so  that  new  knowledge  is  gained  faster. 

Randomized  studies  are  useful  if  there  is  no  basis  for  choosing  comparable 
patients  treated  in  the  past  since  patient  characteristics  related  to  prognosis 
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are  unknown.  Also,  such  studies  could  be  considered  when  there  is  no  prelimi- 
nary evidenco  that  one  treatment  is  substantially  better  than  another  so  that 
the  ethical  dilemma  does  not  really  arise.  Thirdly,  previous  data  will  sometimes 
suggest  that  the  same  treatment  program  be  studied  according  to  different  dosages 
or  schedules,  etc.,  and  it  is  convenient  to  have  these  treatments  in  the  same 
study.  Fourthly,  when  studies  are  to  be  conducted  over  a very  long  term  (say, 

3-5  years  or  more)  then  patients  could  be  randomized  because  there  was  genuine 
doubt  that  the  ancillary  aspects  of  the  successive  studies  would  be  comparable. 

In  planning  any  clinical  trial,  there  is  no  substitute  for  imaginative, 
original,  and  creative  thought.  The  best  clinical  trials  are  those  that  have 
the  best  treatments  in  them,  whether  randomized  or  not.  CMnical  knowledge  will 
advance  when  there  has  been  careful  analysis  of  past  results  as  a basis  for  the 
formulation  of  significant  hypotheses  to  be  tested  in  objective  and  scientifically 
valid  studies. 
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ABSTRACT.  The  Army  needs  Information  about  how  well  an  individual 
can  perform  the  tasks  necessary  for  him  to  do  his  job.  This  information 
is  often  gathered  by  means  of  a "criterion-referenced  test,"  a test  made 
up  of  items  directly  related  to  the  job  of  interest.  The  test  results 
can  be  used  in  two  ways.  The  first  way  is  to  sort  individuals  into  two 
groups,  one  made  up  of  those  who  can  perform  their  job  satisfactorily 
and  the  other  made  up  of  those  who  do  not  meet  minimal  job  requirements. 

A second  use  of  the  test  results  is  to  estimate  the  "true"  capability 
of  the  examinees  to  do  the  task  being  tested.  These  two  uses  are  clearly 
related.  If  one  can  precisely  estimate  an  individual's  capability,  than 
forming  the  two  groups  is  not  a problem.  On  the  other  hand,  it  may  be 
possible  to  effectively  form  the  two  groups  without  getting  good  esti- 
mates of  "true"  capability. 

Several  psychometric  models  are  available  for  grouping  the  indi- 
viduals and/or  for  estimating  "true"  scores.  For  example,  one  may 
simply  calculate  the  proportion  of  items  correctly  answered  and  use  that 
proportion  as  an  estimate  of  "true”  capability.  Alternatively,  a binomial 
error  model  for  deriving  the  expression  for  the  regression  of  "true"  score 
on  observed  score  can  be  used  and  a "true"  score  calculated  for  each 
individual.  Other  possible  models  include  a Bayesian  Model  II  approach 
and  a latent  trait  model  such  as  the  Rasch  one  parameter  logistic  model. 
Each  of  these  models  yields  a somewhat  different  estimate  of  "true" 
capability  for  any  given  individual.  It  follows  that  the  makeup  of  the 
job  ability  groups  will  vary  from  model  to  model.  The  purpose  of  this 
research  is  to  empirically  study  the  models  referred  to  above.  What 
is  needed  is  an  appropriate  statistic  (or  statistics)  and  research 
design  for  comparing  each  model  against  all  others  given  the  same  test 
data. 


I.  INTRODUCTION.  The  purpose  of  this  paper  is  to  elaborate  on 
some  technical  details  and  to  highlight  specific  statistical  and 
research  problems  introduced  in  a previous  paper  by  one  of  the  authors 
(Epstein,  1975) . 

Epstein  described  four  procedures  for  estimating  true  scores  from 
observed  scores.  The  first  uses  the  observed  proportion  correct  as  an 
estimate  of  the  true  proportion  correct.  This  procedure  is ‘straight- 
forward and  familiar.  Hence,  discussion  of  it  will  be  reserved  until 
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the  problem  of  comparing  the  models  is  developed.  The  other  three  pro- 
cedures are  1)  a binomial  error  model,  2)  a Bayesian  model,  and  3)  the 
Besch  logistic  model.  Each  will  be  discussed  in  detail. 

2.  BINOMIAL  ERROR  MODEL.  The  binomial  error  model  (Lord  and 
Novlck,  1968,  pp.  508-529)  is  based  on  the  assumption  that  the  condi- 
tional distribution  of  observed  score  for  given  proportion  correct  true 
score  (T)  is  the  binomial  distribution. 

h(x|T)  ■ (5)  T*  (1-T)n"x 

x»0,l...n  is  the  number  of  correct  responses  observed  and  n.  is  the  total 
number  of  items  on  the  test. 

It  is  assumed  that  items  are  scored  dichotomoualy,  that  total  score 
for  an  examinee  is  the  numbs,  of  items  answered  correctly,  that  items 
are  locally  independent,  and  that  items  are  equally  difficult  for  a 
given  examinee. 

The  relationship  between  the  observed  score  distribution  and  the 
underlying  true  score  distribution  can  be  written  as  follows: 

$(x)  ■ (x)  Z1  g(T)  T*  (l-T)n-x  dT,  x-0,l,...n,  where  $(x)  is 
0 

the  distribution  of  observed  scores  and  g(T)  is  the  unknown  distribution 
of  true  scoreB. 

It  can  be  shown  that  if  the  regression  of  true  score  on  observed 
score  is  linear  than  the  distribution  of  observed  score,  symbolized  h(x) 
to  distinguish  this  special  case  from  the  general  case  d>  (x),  is 
negative  hypergeometric . 

h(x)  s b^nl  <-n>x  ^x  x ■ 0,1... n, 

' <a+b)tnl  <-b>x  xl 

where 


a and  b are  parameters  to  be  determined  and 
ptx]  = n(n-l) . . . (n-x+1) , 

(a)x  s a(a  + l)...(a  +x  -1),  nt°5  h U)q  5 1. 

The  parameters,  a and  b,  can  be  expressed  in  terms  of  moments  of  the 
observed  score  distribution 


a ■ (-l+l/aji)  ux 
b « -a-l+n/c^i 


°21®  « 
n-1 


1-  ^x(n-ux> 


n o' 


66 


.rrSli 


i 

i- 


§ 


The  discussion  thus  far  has  outlined  an  Internal  check  of  the 
appropriateness  of  this  model  for  any  given  data  set.  That  is,  If 
one  can  show  adequate  fit  to  the  negative  hypergeometrlc  distribution 
by  the  observed  scores  then  it  is  reasonable  to  continue  with  this 
model  assuming  linear  regression.  If  adequate  fit  is  not  obtained 
then  either  the  more  general  nonlinear  regression  approach  must  be  used 
or  alternative  models  must  be  identified. 

It  can  be  shown  that  if  the  observed  score  distribution  is  negative 
hypergeometrlc,  the  true  score  distribution  is  either  the  two  parameter 
beta  distribution,  or  some  other  distribution  having  identical  moments 
up  through  order  n.  In  either  case,  the  regression  of  true  score  on 
observed  score  is  given  by  the  linear  equation 

E (T|x)  ■ <>2ix  + (l-a21)ux  , x « 0,1,..  .n. 
n n 

3.  BAYESIAN  MODEL.  The  Bayesian  model  used  to  evaluate  these  data 
is  described  by  Lewis,  Wang,  and  Novick  (1973).  The  procedure  transforms 
the  binomial  test  score  data  via  an  arc  sine  transformation.  The  re- 
sulting score  is  assumed  to  be  a sample  from  a normal  population  with  its 
mean  value  at  the  individual's  transformed  true  ability.  Distributions 
for  the  prior  mean  and  variance  of  the  examinee  group's  transformed 
scores  are  specified  and  posterior  values  calculated.  Finally,  the 
posterior  marginal  distributions  for  the  transformed  scores  are  obtained 
and  estimates  of  individual  true  abilities  on  the  original  (proportion 
correct)  scale  are  calculated.  The  mathematical  details  are  outlined 
below. 

The  Freeman-Tukey  transformation  for  binomial  data  1b  used  in 
this  procedure: 

gj  - 1 sin-1  + sin-l-t/TiTl 

J 2 Vn+1  * xj  - l,2,...n  - the 

number  of  correct  responses.  The  g.  are  assumed  to  be  normally  dis- 
tributed with  mean  yj  - sin~l  /ii”  and  variance  v ■ (4n+2)“l,  where  yj 
is  the  transformed  value  of  the  £rue  proportion  of  correct  responses,  Flj, 
The  validity  of  the  assumption  of  normality  and  the  suitability  of  the 
transformation  for  the  procedures  to  follow  can  be  shown  to  be  adequate 
for  examinee  groups  of  at  least  IS  persons  and  for  tests  at  least  8 items 
long. 


The  set  of  transformed  variables,  Yj , is  assumed  to  be  a random 
sample  from  a normal  distribution  with  mean  up  and  variance  . Up  and 

$p  are  further  assumed  to  be  independent  and  to  have  a uniform  and  Inverse 
chi-square  distribution  respectively.  Explicit  expressions  for  the  prior 
and  posterior  density  functions  are  given  in  the  Lewis,  et  al.  paper. 
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The  desired  result  of  an  analysis  of  this  kind  is  the  marginal 
posterior  density  function  for  Yj  . Unfortunately,  an  explicit  ex- 
pression for  it  is  not  obtainable  from  the  joint  posterior  probability 
density  function  of  the  Yj  vector  given  the  gj  vector.  Lewis  et  al. 
show  methods  for  obtaining  the  marginal  tneansand  variances  for  the 
Yj  using  numerical  integration.  However,  they  indicate  that  for 
large  sample  sizen,  the  conditional  posterior  distribution  of  Yj  given 
and  the  gj  vector  provides  an  acceptable  approximation.  The  con- 
ditional approximation  was  used  for  the  analysis  of  the  data  reported 
In  the  Epstein  paper. 

The  conditional  distribution  of  Yj  given  4> j.  and  the  gj  vector  can 
be  shown  to  be  normal  with  mean  , 

E <YJ  l*i>  , g)  • 8.  + vg. 

s > 

♦ r + V 

and  variance 

var  (YjUr  . g)  * v(£r  + m_1v)  , 

fit 

$r  + v 

where 

J • l,2...m  ■ the  number  of  examinees, 
g ■ the  vector  of  transformed  scores,  and 
s» 

♦p  ■ the  mode  of  $p  given  g . 


can  be  obtained  by  solving  the  following  equation: 

<m  + v + 1)  <^3  [ (m  + 2 v + 3)  v - I (gj  - g.)2  - X)  ^ 

+ [ (v  + 2)  v2  - 2 X v]  - X v2  ■ 0 . 

In  the  above  equation,  v is  the  degrees  oi  freedom  for  the  prior 
inverse  chi-square  distribution  of  $p  . Lewis,  et  al.  recommend  that 
a value  of  eight  be  used  for  most  practical  applications.  X is  the 
scale  factor  for  the  inverse  chi-square  distribution.  It  can  be 
calculated  by  using  the  formula 

X - v - 2 
4<t+l) 
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where  t 1*  interpreted  as  the  number  of  teat  items  that  the  prior 
Information  is  considered  to  be  equivalent  to. 

Once  the  Ya  have  been  calculated,  the  lest  step  in  the  procedure 
is  to  ca.lculeteJrhe  estimates  for  the  true  proportion  correct.  This 
is  accomplished  by  applying  the  following  equation: 

n1  - (1  + 1 ) sin2y,  - 1_ 

2n  J 4n 

4 . RASCII  MODEL . The  Rasch  one  parameter  logistic  model  (Wright  and 
Panchapakesan,  1969)  assumes  that  the  observed  response  an±  of  person 
n to  item  i is  governed  by  a binomial  probability  function  of  person 
ability  Zn  and  item  easiness  . The  probability  of  a correct  response  is 


p (*ni  - 1)  - ZnEi 

The  probability  of  a wrong  response  is: 

P (a^  • 0)  - 1 - P (anl  - 1)  - _i 

i+znEi 


These  equations  may  be  combined  to  yield 

P <«nl>  " (W*ni  . 

l+ZnEi 

If  we  let  bn  - log  Zn  and  d1  - log  Ei  , 
then 

P (ani)  - exp  (ani(bn  + d*)) 

r+  exp  (bn  + d i ) 


The  number  of  correct  responses  to  a given  set  of  items  is  the  only 
information  needed  to  estimate  person  ability.  All  persons  who  get  the 
same  score  will  be  estimated  to  have  the  same  ability.  Hence,  in  terms 
of  score  groups, 

p <*ni>“  exp  (ani(bj  + d^) 

1 + exp  (bj+  d^) 


where  J * score  of  person  n,  and  all  persons  with  a score  j are  esti- 
mated to  have  the  same  probability  governing  their  responses  to  item  i. 


The  equations  obtained  when  the  condition  of  a maximum  likelihood 
is  satisfied  for  the  model  described  in  the  preceding  equation  are: 

k-i 

a+i  = E (r-jexp(bj*  + d**)/ (l+exp(bj  *+di*))  ) , i = 1,2,... k 

j 

k 

j - Z (exp(bj*  + di*) / (l+exp(bj*  + d±*))),  j = l,2,...k-l 


where  a+j[  = number  of  persons  who  get  item  i correct 

j = the  total  test  score,  an  ability  estimate  is 

^ obtained  for  each  score 

r j = number  of  persons  in  score  group  j . 

v 

bj  ,d.j*  = estimates  of  bj  and  d^ 

& 'fc 

The  method  consists  of  computing  d.^  and  bj  from  the  implicit  equations 
above.  The  equations  are  handled  as  two  independent  sets  and  solved 
accordingly. 

An  approximation  of  a standard  error  for  item  estimates  can  be 
obtained  by  assuming  that  the  variance  of  the  item  estimate  is  due 
primarily  to  the  uncertainty  in  the  item  score  a+j,.  To  a first 
approximation  this  gives: 

V(d-£*)  ~ ( 8d;[/ 3a-fi) 

which  leads  to: 

V(d^*)  = 1/E  (rjexp(bj*+di*)/ (1+exp (bj*  + d-j*))2). 

j 


The  major  contribution  to  the  error  variance  of  the  ability 
estimate  comes  from  the  variance  in  scores  produced  by  a given  indi- 
vidual. This  part  of  the  error  variance  depends  upon  the  number  of 
items  and  their  easiness  range. 


An  approximation  of  the  variance  of  the  abili  y estimate  b*  is 
given  by 

V*(b*)  = ( L/C(b*)exp(b*)}  + { i/C2 (b*) } 


• >:  (v(d  j_)  { cxp(d  j)  / (l-toxp  (clj/l-b '"))  2 }2) 
i 

k , 

where  C(b'  ) 'L  (e:<p(d  j ) / ( 1+exp (b*+di)  ) Z)  , 


Best  Available  Copy 


V ( f 1 i ) is  : Im  variance  of  the  item  calibration  dj_. 
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The  first  term  in  the  denominator  of  the  V*(b*)  equation  is  due  to  the 
variance  in  the  score,  and  the  second  term  is  due  to  the  imprecision 
of  item  calibration.  The  first  term  is  always  larger  than  the  second. 

5.  DISCUSSION  OF  THE  PROBLEM.  One  characteristic  of  a useful  model  is 
Chat  it  has  a small  error  of  measurement.  That  is.  the  distribution  of 
estimated  scores  for  a given  true  score  is  closely  clustered  around  the 
true  score.  The  extent  of  the  measurement  error  that  can  be  expected 
with  a given  model  is  dependent  on  the  variance  of  the  estimated  true 
score.  For  example,  in  the  proportion  correct  model,  the  variance  of 
the  estimated  true  proportion  correct  is  equal  to  p(l-p)/n.  In  this 
case  the  variance  of  the  estimate  will  decrease  as  the  number  of  obser- 
vations increases.  Thus  it  would  seem  that  any  level  of  precision  could 
be  obtained  by  simply  adding  observations.  Unfortunately,  for  the  number 
of  Items  that  are  usually  practical  on  a test,  the  level  of  precision 
possible  is  not  completely  satisfactory.  It  would  be  useful  to  compare 
the  variance  of  the  true  score  estimates  obtained  with  the  other  models 
to  the  proportion  correct  model. 


Therefore  the  question  of  how  to  derive  an  expression  for  the 
variance  of  the  estimated  true  scores  for  the  other  models  must  be 
addressed.  An  expression  for  the  binomial  error  model  has  been  derived. 
Since  the  binomial  error  model  results  in  a regression  equation  it  seems 
reasonable  to  base  the  derivation  on  the  general  form  of  the  error  of 
estimation,  2 -2  The  rat*°  the  variance  of  true 


XT 


scores  to  the  variance  of  observed  scores  equals  the  reliability  co- 
efficient, o2  where  u is  the  variance  of  the  true  number 

_£_  ■ a21  » c 
o2 
x 

correct.  Since  the  true  number  correct  equals  the  true  proportion 

correct  times  the  number  of  items,  C - nT,  one  may  write  o2  . n2  a2  . 

. . c T 

Substituting,  ■ a*  »2l/n^  • The  reliability  of  a test  equals 


the  square  of  the  correlation  between  true  and  observed  scores,  a 21 
Hence,  the  variance  of  the  estimated  true  score  can  be  written 


o2 

G 


qx  tt21  (1  - ”20 
n2 


For  the  Bayesian  and  Ranch  models  expressions  for  the  variances 
of  the  estimated  true  scores  were  not  derived.  In  the  case  of  the 
Bayesian  model  the  output  Is  in  terms  of  the  arc  sine  of  the  true  pro- 
portion correct.  While  the  sampling  distribution  of  the  transformed 
variable  is  known,  the  variance  of  the  estimated  true  proportion  correct 
itself  was  not  determined.  A aimilar  problem  exists  for  the  Rascti  model. 
The  sampling  distributions  of  the  ability  and  item  difficulty  indices 


i 

! 
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•re  known  as  well  as  Che  explicit  equation  for  calculating  the  proportion 
correct  from  those  values.  But  an  expression  for  the  estimated  true  pro- 
portion correct  has  not  been  derived.  In  short,  the  problems  are: 

(1)  For  the  Bayesian  model,  given  the  variance  of  oj  and  the  equation 

Hj  ■ (1  + l/2n)  sin*  Yj  - l/4n,  what  is  the  variance  of  Ilj  } and 

(2)  For  the  Rasch  model  siven  the  variances  of  b*  and  d*  and  the  equation 

p (correct)  - e_xp(b»  + d») what  is  the  varianco  of  p? 

1 + exp  (b*  + d*)  , 

As  a result  of  the  discussion  during  the  session  a solution  to  the 
above  mathematical  problems  seems  to  be  available.  It  wsb  pointed  out  f 

that  methods  exist  for  deriving  standard  errors  of  functions  of  random 
variables.  One  promising  approach  outlined  in  Kendall  and  Stuart  (1969, 
p.  231)  Involves  evaluating  terms  of  a Taylor  expansion.  Using  the 
Kendall  and  Stuart  procedure  it  should  be  possible  to  derive  expressions 
for  the  standard  error  of  measurement  for  each  of  the  models.  This  will 
allow  for  formal  comparison  of  the  models  without  real  or  simulated  data. 

The  discussion  then  considered  whether  it  was  possible  to  compare 
the  models  by  obtaining  an  estimate  of  "true  score"  and  comparing  it  to 
the  "real"  true  score.  The  problem  lies  in  obtaining  an  acceptable 
true  score.  Three  approaches  were  considered  and  are  expected  to  pro- 
vide a basis  for  future  research.  The  first  is  to  base  model  compari- 
sons on  Monte  Carlo  simulation  studies.  Monte  Carlo  studies  provide 
an  unambiguous  true  score  but  suffer  from  their  lack  of  generalizability 
to  practical  applications.  A second  approach  is  to  define  true  score 
as  the  score  obtained  on  an  Instrument  consisting  of  a large  number  of 
items.  The  models  would  then  be  used  to  estimate  the  true  score  using 
a smaller  and  more  realistic  number  of  items.  This  approach  is  em- 
pirical and  more  directly  oriented  to  practical  applications  whore 
testing  time  and  the  number  of  items  that  may  be  Included  In  an  Instru- 
ment are  limited.  Although  this  approach  suffers  from  the  fact  that 
the  defined  true  score  Is  not  error  free,  the  amount  of  error  is  not 
likely  to  be  significant  for  practical  purposes.  The  third  approach 
would  investigate  the  possibility  of  applying  Geisser's  predictive 
sample  reuse  method  (Celnser,  1975)  to  the  comparison  of  the  modeLs. 

Geisser's  method  may  provide  a more  formal  empirical  approach  to 
model  comparison  than  the  second  approach  discussed  above,  however, 
it  has  not  been'  determined  whether  or  nut  It  is  applicable  to  this 
research. 

Four  models  for  estimating  true  scores  were  presented  and 
methods  for  comparing  their  outputs  wore  discussed.  Procedures  for 
comparing  the  statistical  properties  of  the  models  are  available  and 
relatively  straightforward.  Future  research  will  be  concerned  with 
establishing  the  empirical  validity  of  the  models  and  their  applica- 
bility to  solving  practical  measurement  problems. 
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NON-RANDOMI ZED  FACTORIAL  DESIGNS  CHARACTERIZED  BY  TREND 
ELIMINATION  AND  A MINIMUM  NUMBER  OF  FACTOR  LEVEL  CHANGES 

Lea  Lancaster  and  Steve  Reynolds 
U.S.  Army  Operational  Test  and  Evaluation  Agency 
Falls  Church,  Virginia 

ABSTRACT.  An  admissible  set  of  run  orders  is  developed  for  2^ 
factorial  designs  restricted  to  trend  elimination.  The  best  design  is  then 
selected  from  this  admissible  set  having  the  minimum  number  of  factor  level 
changes.  The  procedure  is  developed  for  p»5  where  admissible  sets  are  gen- 
erated between  various  mixtures  of  linear,  quadratic,  and  cubic  trend 
elimination  and  main  effects,  first  order  interactions,  and  second  order 
interactions.  The  number  of  factor  level  changes  is  used  to  generate  the 
admissible  set. 

1.  INTRODUCTION.  The  design  of  two-level  factorial  experiments  robust 
against  time  trends  will  be  illustrated  in  this  paper.  In  fact  designs  with 
zero  time  trends  will  be  displayed  that  al'io  keep  the  number  of  factor  level 
changes  form  run  to  run  small.  Both  of  these  features  are  essential  in 
operational  testing  due  to  resource  problems.  Operational  cost  effectiveness 
is  achieved  by  minimising  the  number  of  factor  level  changes.  Soldier  learn- 
ing and  selection  is  controlled  by  an  elimination  of  time  trends  in  the 
experimental  designs.  Thus,  these  designs  are  characterized  by  specifying 
the  run  orders  prior  to  running  the  tests.  A combinatorial  technique  is 
developed  for  generating  these  desirable  designs. 

In  the  planning  of  an  experiment  costs  can  be  reduced  by  a multi-phase 
design.  The  first  phase  would  be  the  design  of  all  controllable  factors  at 
their  low  and  high  levels.  Additional  phases  would  be  adaptive.  That  la, 
the  results  of  the  first  phase  would  be  decisive  for  determining  the  design 
for  the  additional  phases.  Thus,  forcing  the  complex  overall  design  to  be 
developed  in  the  real  time  mode.  However,  the  possible  options  at  each 
phase  are  planned  and  designed  a priori  and  the  results  of  the  previous 
phase  trigger  the  design  decisions  for  the  next  phase.  This  report  will  be 
concerned  with  the  first  phase  where  p factors  are  varied,  each  at  two  levels. 

A method  for  the  selection  of  run  orders  spaced  at  equal  time  intervals 
is  developed  whereby  a subset  of  possible  or  admissible  run  order  choices 
is  restricted  to  trend  elimination.  The  designer  then  has  the  option  to 
randomize  on  this  admissible  set  or  else  he  can  select  the  run  order  with  a 
minimum  number  of  factor  level  changes.  With  respect  to  trend  elimination 
Figure  1 summarizes  seven  admissible  subsets  which  will  be  studied  In  Chapter  6. 
However,  cases  two  and  three  admit  empty  sets  and  are  Included  for  academic 
purposes. 
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Highest  Restriction  On 


Main 

1st  Order 

Effects 

Interactions 

2nd  Order  Inter 

In  Figure  1 the  following  notation  is  used: 


L ■ Linear 

Q *»  Quadratic  and  linear 

C *=>  Cubic,  quadratic,  and  linear 

The  different  cases  can  be  expressed  in  vector  notation  by  writing  each 
case  as  (i,  j , k) . For  example,  case  5 can  be  expressed  as  (Q,  L - ). 
Utilizing  this  notation,  the  coordinate  denotes  where  the  restriction  is  to 
be  placed  and  the  coordinate  value  deontes  the  type  of  restriction.  This  will 
become  clearer  in  Chapter  6. 


The  options  left  to  the  test  designer  for  each  of  the  cases  are  very 
flexible.  In  certain  situations  the  choice  for  a run  order  may  be  dictated 
by  other  criteria  such  as  engineering  judgement  with  respect  to  some 
of  the  factor  interactions.  For  example,  some  of  the  factor  interactions 
or  treatment  combinations  may  be  null  or  of  no  importance  to  the  experimenter. 
For  these  situations  the  chosen  run  order  can  have  a smaller  number  of  factor 
level  changes  as  a tradeoff  for  a higher  time  trend  for  the  null  treatment 
combinations. 


The  developed  method  is  an  alternative  to  full  randomization.  Some 
experimenters  often  use  blocks  to  gain  sensitivity  at  the  expense  of  full 
randomization  by  reducing  time  trends  to  an  average  variation  within  blocks. 
However,  if  the  blocks  contain  many  runs,  then  the  average  trend  within  a 
block  may  still  cause  a disturbing  effect.  In  the  developed  method  random- 
ization is  restricted  to  the  admissible  set  of  runs  whereby  a price  tag  can 
even  be  attached  to  each  ordered  sequence  of  runs  in  the  admissible  set. 
Selection  is  then  based  on  the  set  with  the  total  number  of  factor  level 
changes  minimized.  Procedures  for  partial  randomization  with  respect  to 
equivalence  classes  is  left  as  an  option  to  the  designer. 

2.  REVIEW  OF  PERTIHF.NT  LITERATURE.  In  this  paper  admissible  sets  are 
restricted  to  zero  time  trends  where  the  optimal  run  order  is  chosen  which 
has  a minimum  number  of  factor  level  charges.  Other  work  has  restricted  to 
admissible  sets  having  the  minimum  number  of  factor  level  changes  where  the 
optimal  run  order  is  chosen  which  has  a minimum  (non-zero)  simple  or  multiple 
correlation  with  time.  In  this  paper  the  admissible  sets  have  zero  simple 
and  multiple  correlations  with  time.  Thus  far  in  the  literature  and 
including  this  paper  only  two-level  factors  have  been  studied. 

Addelman  (1)  briefly  su.  rar.izes  the  state-of-the-art  up  to  March  1972. 

Daniel  and  Wilcoxon  (7.)  analyze  full  fractional  factorial  designs  with 
respect  to  linear  and  quadratic  time  trends.  Their  approach  is  extended 
in  this  paper.  They  do  not  consider  factor  level  changes  in  their  run 
orders. 

Draper  and  Stoneman  (4)  were  the  first  to  consider  the  tradeoff  between 
factor  level  charges  and  linear  tine  ircr-ds.  However,  they  look  mostly 
at  the  combinatorial;;  and  it  appear;;  that  they  use  search  techniques  to 
display,  their  run  orders. 
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TJ.ahrt  and  Weeks  (6)  consider  the  selection  of  run  orders  with  respect 
to  factor  level  changes  plus  randomization  on  equivalence  classes. 

Dickinson  (3)  restricts  to  the  minimum  number  of  factor  level  changes 
and  then  selects  his  run  orders  having  nininum  simple  and  multiple 
correlations  with  linear  time  trends.  He  uses  a computer  search  technique 
to  find  a few  of  the  many  possible  run  orders. 

Thomas  (5)  considers  run  orders  with  the  minimum  number  of  factor  level 
changes  and  applies  the  procedure  to  sensitivity  analysis  of  parameters  in 
large  scale  deterministic  computer  models. 

3.  METHOD  OF  DESIGN  SELECTION.  The  method  will  be  illustrated  by 
application  to  a 2^>  factorial  design  With  N =»  32  runs.  That  is,  a full 
factorial  design.  The  extension  to  designs  with  p > 5 will  be  obvious 
from  the  illustration. 

A 2?  factorial  design  is  characterized  by  N «*  2P  runs  of  p factors;  with 
each  factor  at  two  levels.  For  p = 5,  Figure  2 displays  the  design  matrix 
of  + l’s  (l'.s  are  omitted  for  ease  of  typing)  in  standard  Yates  notation 
for  the  32  runs  and  the  32  treatment  combinations  where  "T"  denotes  th*’ 
total  treatment  combination  which  is  omitted  in  the  selection  criterion. 

The  Yates  algorithm  will  be  used  for  computing  polynomial  trend  of 
factors  at  two  levels.  Daniel  and  Wilcoxen  (reference  1)  have  applied  the 
Yates  algorithm  to  the  integer  linear  and  quadratic  Tchebycheff 
orthogonal  polynomials  given  in  Figure  3.  The  Yates  solution  is  equivalent 
to  performing  the  matrix  product  between  the  design  matrix  (plus  and  minus 
ones  as  given  by  Figure  2)  and  the  polynomial  vector.  The  Yates  solution 
is  much  faster  than  the  matrix  product.  The  Daniel-Wilcoxon  procedure  is 
applied  here  where  we  extend  up  to  the  (p-2)th  order  of  the  polynomial. 
Further,  the  method  developed  in  this  paper  will  take  into  account  the 
number  of  factor  level  changes.  In  fact,  it  turns  out  that  the  number 
of  factor  level  changes  for  each  factor  characterizes  and  complements  the 
standard  Yates  design. 

Xn  Figure  3 only  the  first  It)  numbers  are  arrayed.  The  second  set 
of  16  numbers  is  found  by  reflecting  each  column  downward  and  reversing 
the  sign  for  the  linear  and  cubic  column.  For  example,  the  32nd  number 
for  each  column  will  be  -31,  155,  ar.d  P.99. 

For  p » 5,  Figure  4 gives  the  Yates  solution  performed  on  the  Tchebychcf 
orthogonal  polynomials  (Figure  3)  up  to  the  third  order.  In  Figure  4 
the  ordering  of  the  treatment  combinations  has  been  changed  from  the 
standard  Yates  ordering  to  a more  convenient  ordering  for  the.  method  to  be 
developed  in  this  paper.  it.  turns  out  that  this  new  ordering  groups  the 
various  typer,  of  treatments  with  either  sets  of  zeros  or  sets  of  non-zeros. 
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FIGURE  2.  Standard  Yatas  Notation  for  The  Design  Matrix  for  32  Rune 
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FIGURE  3.  Orthogonal  Polynomial* 
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The  factor  level  changes  are  also  given  In  Figure  A.  Note  that  the 
number  of  factor  level  changes  vary  from  1 to  31.  The  main  effect  for  A 
has  the  maximum  number  of  factor  level  changes.  For  determining  the  number 
of  factor  level  changes  for  any  design  only  the  level  changes  for  the  main 
effects  ere  summed.  Therefore)  the  standard  Yates  design  is  characterised 
by  57  factor  level  changes.  Thus,  at  the  references  show,  the  standard  Yates 
design  la  undesirable  with  respect  to  factor  level  changes.  Also,  the 
standard  Yates  design  has  large  correlations  with  time,  again  an  undesirable 
characteristic.  Thus,  optimal  designs  will  be  found  in  this  paper  having 
admissible  properties. 

The  time  counts  for  eech  treatment  are  the  tame  as  the  Yates  solution 
given  in  Figure  A.  Note  that  for  tha  standard  Yates  design  the  main  effects 
have  zero  quadratic  time  trend.  The  first  order  treatment  interactions 
have  zero  linear  and  zero  cubic  time  trend.  The  second  order  treatment 
interactions  have  non-zero  cubic  time  trend.  The  third  order  treatment 
interactions  have  all  zero  time  trend.  These  observations  are  utilized  to 
construct  admissible  run  orders  for  the  cases  given  in  Figure  1. 

The  method  consists  of  developing  a new  algebra  whereby  each  of  the 
31  treatments  la  denoted  by  the  number  of  factor  level  changes.  In  effect 
the  new  algebra  permutes  the  31  columns  of  Figure  2 into  nn  optimal  design. 

In  the  next  section  tha  development  will  be  presented  vie  illustration. 

In  Chapter  6 admissible  sets  of  run  orders  for  various  cases  will 
ba  constructed.  In  these  cases  whenever  the  designer  has  the  option  to 
vandotalzo,  it  is  to  be  understood  that  he  can  also  randomize  with  respect 
to  two  equivalence  classes. 

One  equivalence  class  is  defined  on  the  factor  names.  That  is,  the  names 
(for  example,  A,  B,  C,  D,  or  E)  can  be  chosen  at  random  for  the  adnis9lbla 
set.  There  aro  pi  elements  in  tills  equivalence  class. 

A second  equivalence  class  is  defined  on  thu  choices  of  the  high  and 
low  levels  for  one  or  more  factors.  That  ie,  the  designer  can  choose  the 
pluo  and  minus  r.igns  for  each  rain  effect  at  random.  There  arc  N elements 
In  this  equivalence  class. 
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4*  ALOE BRA.  Multiplication  of  any  two  of  the  31  treatments  defined 
by  Figure  2 entail*  pariwiee  multiplication  of  the  32  elements  making 
up  each  of  the  columns  of  Figure  2.  The  classical  method  of  multi- 
plication will  be  utilised,  whereby  numbers,  rather  than  letters,  will 
b«  used  to  denote  the  treatment  names.  These  numbers  are  the  number 
of  factor  level  changes  for  that  particular  treatment.  That  is,  in 
Figure  4 Instead  of  denoting  the  treatments  by  column  one,  column  two 
will  be  used  to  denote  the  treatments  as  assigned  by  the  standard 
Yates  notation.  As  an  example,  the  classical  multiplication  given  as 
follows: 

AC  * ABD  - BCD 

Is  represented  in  the  new  algebra  as  follows: 

24  * 19  - 11 

Note  that  this  triplet  can  be  rapresanted  in  three  different  ways 
as  follows: 


CD 

24 

* 

19 

- 11 

(ii) 

19 

* 

11 

- 24 

(iii) 

24 

* 

11 

- 19 

Figure  5 displays  tha  155  possible  unique  triplets  as  representation 
(ill)  in  a two-way  table.  To  read  off  any  product  from  Figure  5, 
note  that  the  maximum  value  Is  tha  row,  the  minimum  value  is  tha  column, 
•>»ul  the  value  In  between  is  the  element  of  the  matrix  or  body  of  the 
table.  In  Figure  5 all  (J1)  or  465  different  triplets  could  have  been 
displayed  by  filling  in  the  blanks.  Howsver,  by  filling  in  only  re- 
presentation (ill)  as  deflnsd  above  a pattern  emerges.  On  extension 
to  higher  level  designs,  this  pattern  can  be  taken  into  account  in 
developing  a recursive  method. 


figure  5.  The  155  Possible  Multiplications 
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$»  SIEVE.  In  order  to  generate  opt loal  or  admissible  daalgna 
tha  procedure  antalla  development  and  utilisation  of  a technique  which 
■hall  be  called  a sieve.  Tha  first  step  of  tha  slave  is  formed  by 
displaying  tha  information  from  Figure  3 in  Figure  C for  all  465  pos- 
sible triplets.  In  Figure  6 each  one  of  tha  31  treatments  Is  determined 
by  any  one  of  tha  corresponding  15  pairs*  That  is,  the  pairs  era 
choices  for  tha  two  main  affects  A*  and  I*  and  tha  product  is  tha 
choice  for  tha  treatment  AB*.  Tha  superscript  * denotes  tha  treat- 
manta  belonging  to  a possible  candidate  for  an  optimal  or  admissible 
design.  Further,  in  Figure  6,  tha  symbols  , '*L",  "Q” , or  "C"  are 
taken  from  Figures  1 and  4 and  displayed  ne  an  aid  for  sifting  out 
the  desired  restrictions  for  the  various  cases  of  Figure  1.  The 
idea  is  to  sequentially  search  down  each  of  the  31  blocks  of  Figure  6 
and  sift  out  the  desired  candidates  for  an  admissible  design.  After 
this  first  step  of  the  sieve,  tha  designer  will  have  possible  candidates 
for  A*,  B*,  and  AB*. 
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Figure  6.  Choices 
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The  second  step  of  ths  sieve  Is  concerned  with  finding  the  main 
effect  C*  given  candidates  A*,  B*,  end  AB*.  Since  the  main  effects 
can  be  relabeled  with  respect  to  equivalence  classes,  the  choice  for 
C*  can  be  subjected  to  the  following  constraint t 

A*  < B*  < C* 

Now  to  choose  C *,  suppose  that  A*  and  B*  are  fixed  at  "S'*  and 
"9"  respectively,  then,  for  this  example,  Figure  7 displays  28  possible 
choices  for  C*.  In  Figure  7,  for  any  choice  of  C*,  the  remaining 
three  treatments  in  that  same  row  are  automatically  determined  and 
assigned  as  shown  in  Figure  8,  for  example,  for  the  second  row  of 
Figure  7.  That  is , the  treatments  in  each  row  of  Figure  7 for  C* 
cen  be  permuted,  but  only  these  seven  rows  can  be  defined. 

j 

i 

i 
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la  applying  the  sieve,  tha  last  two  rows  of  Figure  8 can  ba  crossed 
out,  for  tha  example,  due  to  the  ordering  constraint  on  these  three 
candidates  for  ths  main  effects.  This  ordering  constraint  will  also 
redoes  the  set  of  choices  given  in  Figure  7.  Casa  restrictions  will 
further  reduce  the  set  of  choices.  Therefore,  as  the  sequential  search 
for  candidates  progresses,  or  as  A*  and  B*  increase  in  value,  the  set 
of  possible  choices  for  each  new  C*  decreases.  Usually,  tha  possiblllitlea 
need  not  ba  exhaustive  as  shown  by  tha  cases  studied  in  Chapter  6. 

At  this  stage  of  the  sieve,  for  each  possible  candidate  for  an 
admissible  design,  it  turns  out  that  seven  out  of  the  31  possible 
treatments  are  now  fixed.  Tha  third  step  of  ths  slave  is  concerned 
with  finding  admissible  choices  for  D*  and  E*.  To  continue  the  sequan- 
tlal  search,  the  ordering  constraint  is  extended  as  follows: 

A*  < B*  < C*  < D*  < E* 

Suppose  that  the  candidate  under  consideration  at  this  step  is  given 
by  the  first  row  of  Figure  8.  The  new  candidates  will  be  found  from 
tha  blocks  of  Figure  6.  For  this  example,  the  bast  candidate  for  D* 
la  "13".  Further,  on  checking  the  13th  block  of  Figure  6 and  crossing 
out  tha  seven  pairs  corresponding  to  the  seven  fixed  treatments,  the 
best  candidate  for  E*  is  "16".  These  two  candidate  blocks  are  repeated 
from  Figure  6 as  Figure  9 but  without  any  case  restrictions.  Also  in 
Figure  9 the  seven  treatments  for  this  example  are  circled.  As  a check 
on  tha  validity  of  tha  chosen  design,  note  that  in  Figure  9,  each 
block  has  seven  pairs  that  are  eliminated.  Case  restrictions  would 
eliminate  more  pairs.  Due  to  tha  ordering  constraint  and  since  the  sum 
of  the  factor  level  changes  for  the  Min  effects  is  to  ba  minimised, 
only  one  pair  of  D*  and  B*  treatments  need  be  found  for  each  candidate 
up  to  this  step  of  the  slave.  However,  the  three  Min  effects  from 
step  2 will  not  have  a sum  that  strictly  increases  or  decreases  as 
the  sequential  search  progresses. 

After  all  admissible  designs  are  sufficiently  searched  and  dis- 
played the  designer  selects  the  optlMl  design  with  respect  to  the 
particular  case  under  consideration.  However,  due  to  the  ordering 
criterion  and  tha  fixed  choice  of  ths  plus  and  minus  signs  in  Figure  2, 
the  above  selection  is  up  to  an  equivalence  class.  Therefore,  at  this 
point,  the  designer  has  the  option  to  randomize. 


In  order  to  bo  absolutely  euro  that  the  selected  design  is  e 
valid  design,  the  plus  end  minus  eigne  of  the  main  effects  can  be  placed 
back  through  the  standard  Yates  notation  via  the  factor  level  changes 
as  shown  in  figure  10.  In  Figure  10  the  design  to  be  validated  is  given 
by  the  last  rov  while  the  next  to  last  row  is  the  corresponding  Yates 
notation  from  figure  2.  Rare,  a plus  sign  denotes  a value  of  one  and 
a minus  sign  denotes  a value  of  sero.  Thus,  the  Yates  count  ia  deter- 
mined by  writing  the  binary  count  of  the  five  digit  number  of  each 
row  plus  one.  The  Yetea  count  for  a valid  design  should  include  all 
numbers  from  1 to  32. 
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6.  CASE  STUDY.  Figure  1 summarises  seven  cases  with  various  tine  trend 
restrictions.  Figure  11  shove  how  these  cases  or  sets  are  Included  in 
each  other.  The  case  represented  by  (L,  -)  has  a large  number  of 

elements  or  edmissible  designs  as  well  as  the  case  with  no  restrictions. 
Therefore , these  two  cases  will  not  be  analysed  but  are  shown  in  Figure  11 
to  complete  the  picture*  As  mere  restrictions  are  placed  on  the  design, 
or  as  more  arrows  in  Figure  11  are  traced,  the  total  number  of  factor 
level  changes  increases  and  the  trade-off  becomes  a managerial  decision. 

Note  that  Figure  11  is  not  drawn  to  any  scale. 


FIGURE  11.  Inclusion  of  Casas 
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The  logic  for  generating  the  admissible  sets  for  the  various  cases 
has  been  programed  in  FORTRAN.  Table  look-ups , "IF"  statements,  and  "DO" 
loops  simulate  the  sieve,  the  order  constraints,  and  the  restrictions 
and  drive  tha  sequential  aarch. 

CASE  1,  (L.  L.  L).  For  this  case  the  5 treatments  denoted  "-**  in 

Figure  6 must  be  designated  as  third  or  fourth  order  treatments.  There- 
fore, up  to  an  equivalence  class,  this  set  could  have,  at  the  most,  € 
admissible  designs.  If  4 of  the  5 possible  third  order  interactions 
(treatments)  are  fixed  then  the  fifth  one  is  determined.  Therefore,  there 
are  only  3 admissible  designs  and  these  five  designs  are  displayed  in  Figure  12 
In  Figure  12  the  5 admissible  designs  eve  generated  as  follows.  The  first 
4 treatments  are  fixed,  thus  determining  the  next  11  treatments.  The 
treatments  in  line  number  16  are  fixed  next,  thus  determining  the  rest  of 
the  treatments.  The  sum  given  in  the  last  row  characterises  each  design 
and  is  found  by  adding  the  factor  level  changes  or  the  values  denoting 
the  5 main  effects. 
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CASS  2.  (Q.  Q.-).  This  csss  admits  an  empty  set  as  shown  as  follows. 

Utilizing  tha  first  step  of  the  salve,  Figure  13  arrays  the  possible  candidates 
as  given  by  Figure  6 where  each  treatment  of  the  triplet  haa  an  assigned 
Q or  C.  Using  the  ordering  constraint,  these  triplets  have  bean  ordered 
in  Figure  13.  However,  this  ordering  can  be  reversed  if  necessary.  But 
the  second  step  of  the  seive  cannot  be  filled,  since  6 of  the  7 required 
treatments  for  each  candidata  at  this  step  must  be  taken  from  Figure  13. 

Thus  admitting  an  empty  set. 

FIGURE  13,  Candidates  for  Case  2 from  Step  1 of  the  Sieve 
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CASE  3,  (C.  L.  This  case  also  admits  an  empty  set.  This  can 
be  shown  in  a similar  fashion  as  shown  in  case  2 or  by  looking  at  the 
5 treatments  making  up  case  4 and  putting  on  the  further  restriction  on 
the  first  order  Interactions.  To  repeat  the  proof  from  case  2, 

Figure  14  arrays  tha  possible  candidates  from  the  first  stop  of  the 
sieve.  Note  that  in  Figure  14  there  are  only  5 possible  candidates  for 
tha  main  effects  and  the  following  product  violates  any  possible  designs: 

10  * 18  * 20  * 22  . 26 

That  is,  ABCD*  and  E*  must  be  different.  Thus  showing  that  the  set  for- 
esee 3 is  also  empty. 
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FICU1E  14,  Candidates  for  Cat*  3 from  Stop  1 of  Tha  Slava 
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CASE  4,  (C.  -«-) « For  this  eaaa  Flgura  15  arrays  tha  possible 

candidates  from  tha  first  step  of  tha  slave.  Hare  there  are  only  6 
possible  candidates  for  tha  main  affects,  but  one  of  these  is  inadmissible 
due  to  the  following  product  violation: 

10  * IS  * 26  - 2 

20  * 22  - 2 

This  product  violation  is  found  an  execution  of  steps  2 and  3 of  tha  sieve. 
Figure  16  arrays  the  sain  effects  and  the  first  order  interactions  for 
the  5 admissible  designs  for  this  case  along  with  the  sum  of  the  factor 
level  changes.  Figure  16  also  shove  that  tha  set  for  case  3 is  empty,  since 
each  design  has  at  least  one  first  order  interaction  that  violates  the 
further  restriction  imposed  by  going  from  case  4 to  case  3. 


Case  5.  (Q.  I,  -) . This  case  admits  a vary  large  sat  of  admlaalbla 

designs.  Figure  17  displays  soma  of  those  designs  which  were  generated 
in  a fraction  of  a second  on  the  Univac  1108  computer  along  with  the 

total  sum  of  factor  level  changes.  The  desiana  with  sum*  than 
70  were  chosen  to  Illustrate  the  possibilities. 

Figure  17.  Some  Possible  Designs  for  Case  5 
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Case  6.  (L,  L,  -).  This  case  also  admits  a very  large  set  of 
admissible  designs,  a set  much  larger  than  the  set  for  case  5.  Figure  18 
displays  some  of  these  deslgna  which  were  again  generated  in  a fraction 
of  a second  on  the  Unlvar  1108  computer.  The  designs  with  sums  less 
than  56  were  chosen  to  illustrate  the  possibilities.  The  design  with 
a sun  of  43  is  optimal.  For  comparitlve  purposes  the  standard  Yates 
design  has  a sum  of  57  plus  non-zero  time  counts  in  the  main  effects. 
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Figure  18.  Sou*  Possible  Dasigns  for  Cue  6 
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Case  7.  (Q.  -).  Tills  csss  Is  indudsd  for  comparison  purposes.  3 

Although  It's  much  lsrgsr  than  essss  4 sad  5.  It  turns  out  thst  It  . 
has  ths  bum  optimal  Assign  ss  east  5 sa  glvan  by  ths  first  dsslgn 
of  Tlgura  17. 

I 

To  oonpara  thass  casts  further,  ths  optimal  dsaign  for  ths  oass 
expressed  by  (L,  -,  -)  is  givsn  ss  (2,  4,  5,  8,  16)  with  a sum  of  35. 
furthsr,  ths  cats  or  sst  of  Assigns  having  no  rsstrictions  is  givsn  as 
<1,  2,  4,  8,  16)  with  a sum  of  31  or  N-l  as  shown  by  ths  rafarancaa. 

However,  on  rsstrlctlng  to  ths  standard  Yates  notation,  as  this  pspsr 
has  dons,  this  is  ths  only  possibls  dsaign  up  to  an  aquivalsncs  class,  j 

with  a sum  of  31.  On  rslaxing  ths  standard  Yatsa  rsstrlction,  as  ths 
rsfsrsncss  do,  many  designs  can  bs  found  with  a sum  of  31,  but  with 
non-zero  time  counts. 
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7.  APPLICATIONS.  The  application  of  the  techniques  presented  in 
tills  paper  to  operational  testing  can  best  be  shown  by  giving  an  example. 
For  that  purpose,  an  experimental  design  for  an  operational  test  of  the 
hypothetical  ZAP  anti-tank  weapon  will  be  constructed. 

After  analysis  of  the  system  to  be  tested,  five  factors  are  chosen 
to  be  included  in  the  design,  each  factor  being  taken  at  two  levels, 
thus  giving  a 2 5 factorial  experiment.  The  factors  chosen  and  their 
associated  levels  are  shown  in  Figure  19. 

The  Importance  of  eliminating  time  trends  in  such  a test  can  easily 
be  seen.  With  so  few  factors  being  controlled,  there  exist  the  possi- 
bility that  some  uncontrolled  and  unmeasured  factor  is  influencing  test 
results.  Such  factors  as  weather,  crew  learning,  and  crew  morale  can, 
and  usually  do,  change  with  time  through  the  test. 

Another  consideration  in  designing  this  test  is  the  ease  of  execu- 
tion cf  the  design.  Quite  often  a penalty  must  be  paid  in  time,  money, 
and  perhaps  test  validity  for  each  factor  level  change  which  is  made. 

For  instance,  changing  the  vlsablllty  factor  between  day  and  night  too 
often  would  greatly  slow  the  test  execution  and  destroy  any  attempt 
to  portray  a realistic  combat  scenario,  as  it  would  permit  only  a small 
number  of  firings  during  daylight  and  then  delay  further  tasting  until 
night  in  order  to  achieve  the  desired  factor  level  change.  Similarly 
It  may  be  difiicult  and  time  consuming  to  frequently  move  the  test  part- 
icipants and  test  team  from  one  location  to  another  in  order  to  achieve 
changes  in  the  terrain  factor.  As  a third  example,  frequent  changes 
in  the  weapon  factor  may  confuse  the  test  participant  and  prevent  him 
fron  performing  as  well  as  he  might  if  he  were  allowed  to  Btay  with 
one  weapon.  For  example,  one  weapon  nay  require  the  aoldler  to  lead 
a moving  target  while  the  other  weapon  does  not.  If  the  teat  participant 
is  frequently  switching  back  and  forth,  he  may  forget  and  lead  when 
he  should  not  or  not  lead  when  he  should.  Even  if  he  does  remember 
and  does  the  right  thing,  he  may  not  do  It  as  proficiently  as  if  he 
had  been  able  to  concentrate  on  developing  a single  skill  instead  of 
two. 


With  the  foregoing  constraints  in  mind,  we  can  use  the  techniques 
presented  In  this  paper  to  design  a good  test  of  our  hypothetical  anti- 
tank system. 


If  it  is  felt  desirable  to  strongly  protect  the  main  effects,  we 
could  choose  case  five  which  eliminates  linear,  and  quadratic  time  trends 
for  the  main  effects  and  linear  time  trends  for  the  first  order  interactions. 
To  construct  our  design  we  select  one  of  the  admissible  run  orders  found 
for  case  five,  os  t,lven  in  Figure  17.  This  selec.ion  can  either  be  made 
randomly  or  the  ore  with  the  minimum  total  number  of  factor  level  changes 
can  be  chosen.  For  our  exanple,  let  us  choose  the  design  which  minimizes 
the  factor  level  changes.  We  can  then  instruct  our  experimental  design 
by  going  back  to  the  standard  YateF  • ocation  and  writing  out  the  level  changes 
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Figure  19.  Operational  Teat  of  the  ZAP  Anti-tank  Weapon 
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for  the  five  factors  as  defined  by  the  level  change  numbers  given  in 
Figure  17.  This  design  is  given  in  Figure  20.  As  with  the  selection 
pf  a design  from  the  set  of  admissible  run  orders,  the  assignment  of 
the  five  factors  to  the  five  columns  can  be  done  either  randomly  or  by 
ordering  the  factors  based  on  which  factor  should  have  the  fewest  level 
changes  and  which  could  have  more  level  changes. 

Suppose  after  examining  Figure  20  we  feel  this  design  is  not  desirable 
because  the  number  of  factor  level  changes  for  visiblility,  weapon,  and 
terrain  are  excessive  for  the  reasons  discussed  in  paragraph  4 of  this 
chapter.  One  alternative  would  be  to  relax  the  constraints  on  the  elimina- 
tion of  higher  order  time  trends.  We  could  decide  to  select  a design 
which  eliminates  only  linear  time  trends  for  the  main  effects,  and  first 
order  interactions.  For  this  we  can  choose  case  six.  Figure  18  gives 
admissible  run  orders  for  case  six.  Going  through  the  same  procedure 
as  for  case  five,  we  come  up  with  the  design  given  in  Figure  21. 

Given  that  this  design  is  determined  to  be  staisfactory,  it  only 
remains  to  randomly  assign  a plus  or  minus  to  the  actual  level  names 
for  each  factor.  For  ease  of  planning  the  conduct  of  the  test,  it  may 
prove  convenient  to  display  the  design  information  of  Figure  21  in  a 
more  conventional  format  as  shown  in  Figure  22  where  the  number  in  each 
cell  gives  the  order  of  execution  of  each  test  event  in  filling  out 
the  full  factorial  design. 


Figure  20.  Case  5 Candidate  Design  for  the  ZAP  Test 
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Figure  21.  Case  6 Candidate  Design  for  the  ZAP  Test 
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8.  FUTURE  WORK.  The  computer  lo.-J  for  recursively  generating 

factorial  designs  having  mors  then  five  factors  would  be  desirable. 
Admissible  designs  with  a mix  of  two  and  three  level  factors  would  be 
more  realistic.  Of  further  cone'-.”"  wovi*  be  optimal  fractional  factorial 
designs. 
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ABSTRACT.  A method  of  estimating  error  variance  in  a 


non- replicated  experiment  by  separating  an  interaction  term 
into  sums  of  squares  of  non- additivity  and  sums  of  squares 


pertaining  to  error  was  examined.  A sequential  procedure 
to  test  individual  degrees  of  freedom  of  the  interaction 


to  test  individual  degrees  of  freedom  of  the  interaction  term 
for  non -additivity  was  introduced.  Five  test  statistics  that 
could  be  applied  to  the  sequential  procedure  are  given.  The 
critical  values  needed  for  each  of  the  test  statistics  for 
a ■ 0.05  and  0.15,  for  10*  20*  and  30  degrees  of  freedom  re- 


spectively in  the  term  being  tested*  and  for  three  stages  of 
the  sequential  procedure  were  estimated  by  Monte  Carlo  methods. 


The  five  test  statistics  were  compared  as  to  their  power 
and  ability  to  estimate  error  variance  when  non-additive  in- 
dividual sums  of  squares  were  combined  with  individual  sums 
of  squares  that  estimated  error  variance.  The  results  and 
recommendations  as  to  which  is  the  best  test  statistic  are 


given.  The  data  indicated  that  using  a higher  level  of  sig- 
nificance than  0.15  would  better  estimate  error  variance. 


! 


INTRODUCTION.  Frequently*  due  to  the  nature  of  an 
it  or  through  poor  planning,  a design  is  formed  with- 


out replication.  When  this  happens  the  experimenter  has  no 
estimate  of  experimental  error  in  his  data.  This  situation 


is  illustrated  in  Table  1 taken  from  Fisher  (1951).  Since 
each  entry  in  this  table  represents  a single  observation, 
there  is  no  way  to  estimate  experimental  error.  The  usual 
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solution  to  this  problem  is  to  assume  an  additive  model  (no 
interaction)  and  to  use  the  residual  sum  of  squares  as  an 
estimate  of  error.  In  a model  with  two  main  effects  this 
means  renaming  the  two-way  interaction  as  error.  For  the 
data  in  Table  1 the  three-way  interaction  alone  may  be  pooled 
into  error  or  possibly  the  three-way  and  one  or  both  of  the 
two-way  interactions  may  be  pooled  depending  upon  the  experi- 
ment and  the  analyst.  Having  an  estimate  of  the  error  the 
experimenter  may  now  be  able  to  test  other  terms  in  the  model 
that  weren't  testable  before  pooling. 

The  problem  with  this  procedure  is  that  some  of  the 
pooled  sums  of  squares  may  have  estimated  interaction  and 
not  error.  If  this  happens,  the  estimate  of  the  error  will 
be  too  large  giving  the  experimenter  a less  sensitive  test 
of  other  terms  in  the  model. 

How,  then,  can  it  be  determined  if  the  mean  square  of 
an  interaction  term  estimates  error,  interaction,  or  both? 
This  paper  examines  five  test  statistics  that  are  designed 
to  answer  this  question.  It  will  be  restricted  to  fixed 
models  with  one  observation  per  cell.  The  techniques  devel- 
oped can  be  applied  to  any  or  all  interaction  terms  in  any 
n-way  model. 


i 
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Using  the  Modified  Abbreviated  Doolittle  (MAD)  computer 
routine  developed  by  Bryce  (1970),  the  terms  of  a fixed 
model  can  be  broken  into  single  degree  of  freedom  sums  of 
squares.  These  single  degree  of  freedom  sums  of  squares 
form  the  building  blocks  of  the  five  test  statistics.  The 
individual  sums  of  squares  of  an  interaction  term  are  ranked  ) 

and  sequentially  tested  one  at  a time  starting  with  the  5 

largest  until  non-significance  is  declared.  At  this  point, 
the  significant  single  degree  of  freedom  sums  of  squares 
are  pooled  together  as  the  part  estimating  interaction  and 
the  rest  of  the  sums  of  squares  and  their  corresponding 
degrees  of  freedom  are  pooled  into  error  which  is  hopefully 
free  of  interaction. 

This  paper  will  compare  the  ability  to  find  interaction 
when  present,  or  power,  of  the  five  test  statistics  and  the 
ability  of  each  to  estimate  o2. 
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2.  TEST  PROCEDURE.  The  expected  mean  square  of  any 
interaction  term  can  be  broken  into  two  parts.  The  first 
part  contains  the  error  variance , a2,  and  the  second  part 
contains  the  sum  of  the  remaining  different  possible  vari- 
ance  components.  The  number  of  terms  in  the  second  part 
would  depend  on  the  ANOVA  model.  If  interaction  exists , 
then  the  mean  square  of  an  interaction  term  estimates 
the  sum  of  the  two  parts  of  the  expected  mean  square;  i.e., 
o2  plus  the  rest  of  the  terms.  However,  if  interaction 
does  not  exist,  the  mean  square  estimates  only  the  error 
variance.  If  for  a given  model  interaction  is  not  present, 
it  would  be  appropriate  to  pool  the  sums  of  squares  and 
degrees  of  freedom  associated  with  the  interaction  terms 
into  the  error  term. 

The  sum  of  squares  and  n degrees  of  freedom  of  a term 
in  the  model  can  be  partitioned  into  n sums  of  squares, 
each  associated  with  one  degree  of  freedom.  If  an  inter* 
action  term  is  so  partitioned,  the  resulting  single  degree 
of  freedom  sums  of  squares  estimate  either  error  variance 
or  interaction.  It  would  be  desirable  to  extract  the  por- 
tion that  estimates  error  only,  thus  giving  an  estimate  of 
o2  and  making  it  possible  to  test  other  terms  in  the  model. 
This  procedure  assumes  that  some  of  the  partitioned  single 
degree  of  freedom  sums  of  squares  estimate  o2  only  and  that 
not  all  estimate  interaction. 

The  steps  for  the  proposed  sequential  procedure  for 
testing  any  interaction  term  and  estimation  of  o2  are: 

1.  Separate  the  term  with  n degrees  of  freedom  into 
n sums  of  squares  containing  one  degree  of  freedom  each. 

2.  Rank  the  n sums  of  squares. 

3.  Apply  one  of  the  test  statistics  to  the  largest 
sum  of  squares. 

4.  Check  for  significance  using  the  appropriate  values 
in  the  table  for  a and  stage.  (Stage  is  the  number  of  the 
sequential  test  that  is  being  performed  on  the  individual 
sums  of  squares  of  an  interaction  term.  For  example,  stage 
one  is  the  test  of  the  largest  individual  sum  of  squares, 
stage  two  the  second  largest  and  so  on.) 
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5.  If  significance  is  declared , return  to  step  three 
using  the  sane  test  statistic  and  significance  level  to 
test  the  next  largest  sun  of  squares.  If  no  significance 
is  found,  proceed  to  step  six. 

6.  Pool  the  significant  sums  of  squares  and  degrees 
of  freedom  into  one  Interaction  term. 

7.  Pool  the  remaining  sums  of  squares  with  their 
appropriate  degrees  of  freedom  into  error. 

3,  TBST  STATISTICS.  The  proposed  test  statistics  will 
be  labeled  Fi,  fS,  F4,  and  F5  for  convenience  and  the  sum 
of  squares  of  a single  degree  of  freedom  interaction  term  will 
be  written  as  Si  where  (Si  < Sj  < ...  < Sn).  The  stage  in 
the  sequential  test  procedure  will  be  denoted  by  r and  n will 
denote  the  degrees  of  freedom  in  the  interaction  term  before 
testing. 

The  test  statistics  are: 

n 

z £i_ 

FI  . i-n-r+1  r 
n-r 

z sj 

j-1  n-r 

F2  • — 

S1 

„ _ sn-r+l 

F3  - 

Z Si 
i-1 


n 

Z 

i-n-r+1 

n 

E SJ 
j-l  J 

sn-r+l 
n-r 
Z Si 
i-1 


FI  could  be  described  as  the  sums  of  squares  having 
been  declared  significant  plus  the  test  sum  of  squares 
(the  individual  sum  of  squares  being  tested  for  significance) 
averaged  and  divided  by  the  average  of  the  remaining  sums  of 
squares.  F2  is  the  test  sum  of  squares  devided  by  the  small- 
est sum  of  squares.  F3  is  the  test  sum  of  squares  divided 
by  the  total  sums  of  squares  of  the  interaction  term.  F4  is 
a composite  of  FI  and  F3.  F5  is  the  numerator  of  F3  divided 
by  the  sum  of  the  sums  of  squares  less  than  the  test  sum  of 
squares. 


4.  GENERATION  OF  CRITICAL  VALUES.  The  sequential  test 
procedure  was  developed  to  test  the  hypothesis  of  no  inter- 
action present  in  the  single  degree  of  freedom  sum  of  squares 
of  any  interaction  term.  This  would  mean  that  each  of  the 
single  degree  of  freedom  interaction  sum  of  squares  estimate 
error  and  follow  a central  chi-square  distribution  with  one 
degree  of  freedom.  The  null  hypothesis  for  the  test  proce- 
fure  at  the  first  stage  could  be  written 

Hfl  : Xi  ■ X2"«  • • ■ Xn  ■ 0 

where  X^  represents  the  non-centrality  parameter  of  the  chi- 
square  associated  with  each  of  the  ordered  single  degrees  of 
freedom.  If  the  test  proceeds  to  the  second  stage  the  null 
hypothesis  would  be 

Hfl  : \\  - X2  ■ . . . - Xn.x  - 0 

and  so  on  at  other  stages  of  the  test. 

Under  the  null  hypothesis  it  is  possible  to  generate 
the  critical  values  for  each  test  statistic  using  one  degree 
of  freedom  central  chi-squares.  Two  parameters  affect  the 
shape  of  the  distribution  of  each  test  statistic;  the  stage 
of  the  test  and  the  number  of  degrees  of  freedom  in  the 
interaction  term  under  consideration.  Using  an  electronic 
computer,  the  distributions  of  each  of  the  test  statistics 
were  simulated  for  three  stages  and  interaction  terms  of  ten, 
twenty,  and  thirty  degrees  of  freedom.  The  upper  portion 
of  the  distributions  were  ordered  and  the  five  and  fifteen 
percent  points  were  found  thereby  giving  an  estimate  of  the 
0.05  and  0.15  critical  values  under  the  null  hypothesis. 

The  single  degree  of  freedom  chi-squares  were  formed 
by  generating  a standard  normal  value  and  squaring  it.  Each 
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standard  normal  was  generated  by  the  Box-Muller  (1958) 
transformation  using  uniform  values  generated  by  the  McGill 
Random  Number  Generator  Package,  supplied  by  McGill  Univer- 
sity. This  method  of  generating  standard  normals  was 
found  satisfactory  by  Thomas  (1975). 

A more  detailed  explanation  of  how  the  critical  values 
were  found  for  stage  one  and  ten  degrees  of  freedom  of 
interaction  will  now  be  given.  Ten  one-degree  of  freedom 
central  chi-aquares  were  generated  and  ordered.  A value  for 
each  of  the  five  test  statistics  was  calculated  and  saved. 

This  process  was  repeated  ten  thousand  Limes.  The  upper 
portion  of  the  ten  thousand  values  for  FI  was  ordered  and  the 
five  percent  and  fifteen  percent  points  were  found.  This 
gave  the  estimated  critical  values  for  a stage  one  test  of 
an  interaction  term  containing  ten  degrees  of  freedom  using 
FI  as  a test  statistic.  The  critical  values  were  found  in 
the  same  manner  for  F2,  F3,  F4,  and  F5.  This  process  was 
repeated  for  twenty  and  thirty  degrees  of  freedom  in  inter- 
action. 

Stage  two  critical  values  for  ten  degrees  of  freedom 
interaction  terms  and  a - 0.05  were  estimated  by  again 
generating  values  for  the  test  statistics  in  the  same  manner 
as  above.  If  generated  numbers  of  the  test  statistics  exceeded 
the  0.05  critical  values  with  ten  degrees  of  freedom  for  inter- 
action at  stage  one,  the  test  statistic  for  stage  two  was 
formed  and  saved.  This  was  repeated  until  two  thousand  values 
at  stage  two  were  accumulated.  The  upper  portion  was  ranked 
and  the  estimate  of  the  0.05  critical  value  for  stage  two  was 
found.  The  same  procedure  was  followed  to  find  the  table 
values  for  o ■ 0.15  and  so  on  for  twenty  and  thirty  degrees 
of  freedom  of  interaction. 

The  calculation  of  stage  three  critical  values  is  an 
extension  of  the  stage  two  procedure.  Critical  values  under 
the  null  hypothesis  were  calculated  and  if  they  exceeded  the 
appropriate  critical  values  of  both  stage  one  and  stage  two 
the  test  statistic  for  stage  three  was  formed  and  saved  until 
two  thousand  were  accumulated.  They  were  then  ordered  as  be- 
fore and  the  estimates  of  the  five  percent  and  fifteen  percent 
critical  values  were  found.  The  complete  table  of  critical 
values  generated  is  found  in  Table  2.  The  critical  values 
do  not  extend  past  stage  three  because  of  the  length  of  com- 
puter time  that  would  be  necessary  to  generate  stage  four 
critical  values. 
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CRITICAL  VALUES  FOR  FI,  F2, 
F3,  F4,  and  F5 


n is  the  total  degrees  of  freedom  associated  with  the  1 
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5.  CHOICE  OP  a.  It  nay  be  desirable  to  sake  the  test 

for  interaction  at  a relatively  small  alpha  rather  than  a 
large  one.  A small  a under  H#:  * Xg  * • * • " *n  * 

may  lead  to  an  inflated  estimate  of  o*  by  way  of  the  se- 
quential test  because  when  no  significance  is  found  the 
test  procedure  is  halted  and  the  error  sum  of  squares  is 
calculated.  A test  using  a small  alpha  may  not  find  inter- 
action when  it  is  present  thus  leading  to  an  inflated  esti- 
mate of  a2.  Therefore,  any  tests  of  other  factors  in  the 
model  using  the  inflated  error  would  be  conservative.  With 
this  in  mind,  critical  values  for  alpha  equal  to  0.05  and 
0.15  were  estimated. 

It  should  be  noted  that  the  level  of  significance  must 
remain  the  same  at  all  stages  of  the  test  when  using  the 
critical  values  developed  here.  For  example,  it  is  not 
appropriate  to  test  at  stage  one  using  a * 0.15  and  after 
finding  significance  to  test  at  stage  two  using  a ■ 0.05. 

6.  GENERATION  OF  POWER  DATA.  Power  in  a sequential 

test  is  an  elusive  concept.  For  this  reason,  power  at  stage 
one  is  defined  to  be  the  probability  of  rejecting  the  null 
hypothesis,  H#:  ■ \2  “ • • * " *n  * °*  the  null 

hypothesis  is  false.  Power  at  stage  two  is  the  probability 

of  rejecting  the  null  hypothesis,  H#:  A^  ■ X2  ■ . . . ■ Xn.i*  0, 
given  the  null  hypothesis  is  false. 

Data  generated  to  compare  the  power  of  the  five  test 
statistics  were  divided  into  two  cases.  Case  one  consisted 
of  generating  ten,  twenty,  or  thirty  standard  normal  de- 
viates, adding  a single  non-centrality  parameter,  A^,  to 
one  of  these  at  random,  and  squaring  each.  The  result  was 
one  non-central  and  (n-1)  central  cni- squares.  The  sequen- 
tial test  procedure  was  then  performed  using  one  of  the  test 
statistics  at  a level  of  significance  a.  This  was  repeated 
one  thousand  times  adding  tne  same  non-centrality  parameter, 

A to  a new  set  of  standard  normal  deviates  and  keeping  a 
record  of  the  number  of  times  significance  was  declared.  An 
estimate  of  power  for  the  test  statistic,  at  a,  n degrees  of 
freedom  for  interaction,  and  Ai  at  stage  one  was  calculated 
by  dividing  the  number  of  times  significance  was  declared 
by  one  thousand.  The  above  process  was  repeated  for  every 
possible  combination  of  test  statistics,  levels  of  signi- 
ficance, number  of  degrees  of  freedom  for  interaction,  and 
non-centrality  parameters.  The  non-centrality  parameters  are 


Aj  ■ 1.5,  Aj  “ 2.5,  A3  ■ 3.5,  and  A<  ■ 4.5.  The  sequential 
test  for  power  in  case  one  was  not  carried  past  the  first 
stage.  Tne  experiment  was  repeated  once  to  form  an  estimate 
of  experimental  error. 

A test  for  power  at  both  stage  one  and  stage  two  was 
performed  in  case  two  data,  n random  standard  normal  devi- 
ates were  again  generated  and  non-centrality  parameters 
were  added  to  two  randomly  selected  standard  normals  before 
squaring.  The  sequential  test  was  applied  and  the  process 
repeated  one  thousand  times  keeping  count  of  the  total  num- 
ber of  times  significance  was  declared.  Each  time  signi- 
ficance was  found  the  test  would  proceed  to  stage  two  to 
test  for  significance  and  a tally  was  kept  of  tne  number  of 
times  the  null  hypothesis  was  rejected. 

For  a certain  a,  test  statistic,  n degrees  of  freedom 
of  interaction,  and  set  of  non-centrality  parameters,  power 
at  stage  one  was  the  number  of  times  significance  was  found 
divided  by  one  thousand  while  power  at  stage  two  equaled 
the  number  of  times  the  null  hypothesis  was  rejected  at 
stage  two  divided  by  the  total  number  of  tests  made.  (The 
total  number  of  tests  made  at  stage  two  was  the  number  of 
times  significance  was  declared  at  stage  one.) 

The  above  power  for  case  two  was  calculated  indepen- 
dently for  each  combination  of  degrees  of  freedom  of  inter- 
action, test  statistics,  levels  of  significance,  and  pairs 
of  non- centrality  parameters.  As  in  case  one,  the  experiment 
was  replicated  on-.e.  There  were  ten  different  pairings  of 
Xi,  A-t  added  to  form  non- central  chi-squares.  These  are 
listed  in  Table  3. 

Table  3 

Pairings  of  Non-centrality  Parameters 
Added  for  Case  Two  Power 


7.  GENERATION  OP  MEAN  SQUARE  ERROR  DATA.  As  the  above 
procedure  for  power  was  being  performed  data  for  an  analysis 
of  the  ability  o£  the  test  statistics  to  estimate  o2  was 
also  being  compiled. 

As  each  set  of  ten,  twenty,  or  thirty  chi-squares  was 
generated  for  case  one  data,  the  test  procedure  would  check 
for  significance  at  different  stages  until  none  was  found. 

It  would  then  tally  the  sum  of  squares  and  degrees  of  freedom 
to  be  pooled  into  error.  This  would  proceed  until  all  one 
thousand  sets  were  tested.  The  estimate  of  o2  was  then  cal- 
culated by  dividing  the  total  sums  of  squares  pooled  into 
error  by  the  pooled  degrees  of  freedom.  If  significance  was 
found  at  each  of  the  first  three  stages  in  any  of  the  one 
thousand  sets,  (n-3)  degrees  of  freedom  and  the  sums  of 
squares  not  declared  significant  were  added  to  error.  Since 
these  data  were  calculated  simultaneously  with  the  power 
there  are  two  independent  observations  for  all  combinations 
of  test  statistics,  degrees  of  freedom  in  interaction,  non- 
centralities, and  levels  of  significance.  The  case  one  mean 
square  error  data  were  calculated  for  five  Aj,  four  being 
the  same  as  in  the  power  analysis  and  the  fifth  being  equal 
to  zero. 

Mean  square  error  data  for  case  two  were  generated  simul- 
taneously with  case  two  power  data.  As  both  a stage  one 
power  test  and  stage  two  power  test  were  performed  for  case 
two  data,  mean  square  error  data  were  also  collected  at  both 
the  stage  one  power  test  and  stage  two  power  test.  Case 
two  mean  square  error  data  will  be  labeled  and  discussed  in 
terms  of  stage  of  power  test.  This  avoids  the  problem  of 
thinking  of  the  MSE  data  as  "stage  one  MSE"  and  "stage  two 
| MSE"  which  carries  the  wrong  connotation  since  both  errors 
i are  estimated  using  the  three-stage  sequential  procedure. 

Mean  square  error  data  at  stage  one  power  test  were 
\ collected  as  follows.  The  sequential  (up  to  three  stages) 
jj  procedure  was  applied  to  each  set  of  n single  degree  of  free- 

I dom  interaction  sum  of  squares.  If  non-significance  occurred 

I at  stage  one  all  n sums  of  squares  were  pooled  into  the  error 
!|  estimate.  When  significance  was  declared  at  stage  one  but 
? not  at  stage  two  (n  • 1)  sums  of  squares  were  pooled  into 

\ the  error  estimate  and  with  significance  at  stages  one  and 

two  but  not  at  stage  three  (n  - 2)  sums  of  squares  were  pooled 
| into  the  error  estimate.  It  was  decided  if  significance  was 
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found  at  all  three  stage*  that  the  regaining  (n  - 3)  tuns 
of  squares  would  be  pooled  into  error.  Thus,  each  of  the 
one  thousand  sets  of  n suns  of  squares  contributed  sone- 
thing  to  the  estimate  of  error. 

Mean  square  data  at  stage  two  power  test  xere  collected 
in  a different  aanner  than  at  stage  one  power  test.  The 
sane  three  stage  sequential  procedure  was  applied,  but  only 
to  those  sets  of  n suns  of  squares  which  were  declared  sig- 
nificant at  the  stage  one  power  test.  If  non- significance 
was  observed  at  the  stage  one  power  test,  then  the  set  of 
n suns  of  squares  did  not  becone  a part  of  the  error  estinate 
at  the  stage  two  power  test.  Thus  fewer  than  one  thousand 
sets  of  n stuns  of  squares  were  used  in  the  stage  two  power 
test  estinate.  One  night  say  that  the  nean  square  error 
calculated  at  stage  two  power  test  is  "adjusted"  for  those 
cases  where  non- significance  was  found  at  stage  one  power  test* 

This  procedure  was  repeated  for  each  conbination  of 
n,  F,  a,  and  pairings  of  Xi,  Aj.  The  entire  process  was 
replicated  so  that  two  independent  estinates  of  error  were 
obtained  at  each  design  point. 

The  nean  square  error  data  at  stage  one  power  test  are 
the  values  of  interest  in  this  paper.  They  will  be  larger 
than  the  nean  square  error  values  calculated  at  stage  two 
power  test  because  the  sums  of  squares  and  degrees  of  freedom 
are  pooled  into  the  mean  square  error  at  stage  two  power  test 
only  if  significance  was  found  at  stage  one  power  test.  This 
neans  that  the  largest,  individual  sum  of  squares  that  is  not 
declared  significant  at  stage  one  is  never  pooled  into  the 
mean  square  error  at  stage  two  power  test.  If  one  decided 
to  estimate  o2  only  when  significance  was  found  at  the  first 
stage  of  the  sequential  procedure  then  the  values  of  mean 
square  error  at  stage  two  power  test  would  give  a picture  of 
the  results  one  might  expect  from  the  test  statistics.  How- 
ever, if  one  wanted  an  estimate  of  o2  independent  of  signi- 
ficance being  declared  at  stage  one  of  the  sequential  pro- 
cedure the  mean  square  error  data  generated  at  stage  one  power 
test  one  will  indicate  which  is  the  best  test  statistic. 

8.  METHOD  OF  ANALYSIS  OF  DATA.  Analysis  of  variance 
was  used  to  analyze  the  data  generated  for  case  one  power. 

A four-way  factorial  model  complete  with  all  interactions 
was  formed  using  degrees  of  freedom  of  interaction  (n),  test 
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statistic  (F),  non- centrality  parameter  (A),  and  signifi- 
cance level, (a), for  the  four  main  effects.  Degrees  of  free- 
dom of  interaction  had  three  levels  (ten,  twenty,  and  thirty), 
test  statistics  had  four  levels  (FI,  F2,  F3»  and  FS),  non- 
centrality parameters  had  four  levels  (1.5,  2.S,  3.5,  and 
4.5),  and  alpha  had  two  levels  (0.05  and  0.15).  F4  was  left 
out  of  the  analysis  in  case  one  because  power  wasn't  extended 
past  stage  one  and  at  stage  one  F3  and  F4  are  the  same  test 
statistic.  The  main  effects  for  this  model  and  for  all  models 
in  this  paper  were  considered  fixed. 

The  dependent  variable  in  the  power  analysis  is  a pro- 

fortion.  In  case  one  data  one  thousand  independent  tests 
or  power  were  mrde  for  each  combination  of  n,  F,  a,  and  A. 

The  proportion  was  formed  by  dividing  the  number  of  times 
the  null  hypothesis  wt  s rejected  by  the  total  number  of  tests 
made. 


Because  of  the  range  of  non-centralities  used  to  ge  irate 
the  data,  it  is  possible  that  the  assumption  of  homogene: as 
variance  in  each  cell  is  violated.  For  this  reason,  the  arc- 
sine transformation,  as  described  by  Snedecor  and  Cochran 
(1967),  was  used  on  the  data  but  ver;  little  difference  was 
found  between  the  analysis  of  the  ra*  data  and  that  of  the 
transformed  data  so  the  analysis  of  the  raw  data  was  used. 

Case  two  power  data  were  analyzed  using  a five-way  fac- 
torial model.  The  five  main  effects  were  degrees  of  free- 
dom for  interaction  (ten,  twenty,  and  thirty),  alpha  (0.0S 
and  0.15).  test  statistic  (FI,  F2,  F3,  F4,  and  F5) , non- 
centralities (the  ten  pairs  in  Table  3),  and  stage  (stage 
one  and  stage  two).  The  number  of  binomial  results  going 
into  each  observation  of  case  two  power  data  varied  with 
stage.  At  stage  one,  one  thousand  binomial  results  went 
into  each  observation  while  at  stage  two  the  number  of  bino- 
mial results  that  went  into  each  observation  were  the  number 
of  times  significance  was  declared  out  of  the  one  thousand 
trials  at  stage  one.  This  is  because  the  sequential  test 
procedure  doesn't  proceed  to  stage  two  unless  significance 
occurs  at  stage  one.  Analysis  was  performed  on  the  raw 
data  and  also  a weighted  arc- sine  transformation  of  the  data, 
weighted  by  the  number  of  binomial  results  making  up  each 
observation.  Very  little  difference  was  found  in  tne  results 
between  the  two  analyses  and  so  only  the  analysis  of  the  raw 
data  will  be  considered  here. 


i 

i 


5 


129 


i 


k-  *rri 


Before  describing  the  method  of  analyzing  the  mean 
square  error  data,  consideration  of  what  would  be  the  best 
estimate  of  mean  square  error  by  a test  statistic  in  this 
paper  will  be  made.  Ideally,  the  test  statistic  would 
identify  any  single  degree  of  freedom  sums  of  squares  that 
have  interaction  in  them  and  pool  into  error  only  the  sums 
of  squares  that  truly  estimate  error.  Each  single  degree 
of  freedom  that  estimates  error  is  a central  chi-square 
with  one  degree  of  freedom  and  with  expected  value  equal 
to  one.  Since  the  expectation  of  a sum  of  central  chi- 
squares  is  equal  to  the  sum  of  their  degrees  of  freedom, 
the  expected  value  of  the  pooled  sum  of  squares  of  error 
when  all  interactions  have  been  extracted  by  the  test  sta- 
tistic is  equal  to  the  pooled  degrees  of  freedom.  The 
expected  mean  square  error  would  then  be  equal  to  one.  If 
the  test  statistic  fails  to  remove  all  of  the  interaction 
the  expected  mean  square  would  be  greater  than  one.  If  the 
test  statistic  using  the  sequential  procedure  pools  only 
part  of  the  single  degree  of  freedom  sums  of  squares  that 
estimate  a2  into  error  the  resulting  mean  square  error 
would  be  less  than  one  on  the  average.  This  is  because  the 
sums  of  squares  of  error  left  in  interaction  would  be  the 
largest  sums  of  squares,  not  just  any  sums  of  squares  se- 
lected at  random,  leaving  the  smaller  for  error  thus  de- 
creasing the  expected  value  of  mean  square  error.  Hence, 
for  the  data  generated  here,  the  ideal  test  statistic  would 
yield  an  estimate  of  error  having  an  expected  value  equal 
to  one. 

Analysis  of  variance  was  also  used  to  analyze  the  mean 
square  error  data  of  case  one  and  case  two.  Although 
heterogeneity  of  variance  exists,  since  the  observ  ions 
are  central  or  non-central  chi-squares,  Scheffe'  (1^59) 
notes  that  if  an  analysis  is  balanced  the  heterogeneity  of 
variance  has  little  consequence.  This  was  seen  in  the 
analysis  of  the  raw  and  transformed  power  data.  The  analysis 
of  case  one  and  case  two  mean  square  error  data  was  performed 
on  the  untransformed  dependent  variable  using  the  error  es- 
timate produced  by  replication  to  test  terms  in  the  model. 

The  ANOVA  model  for  case  one  and  case  two  mean  square 
error  were  the  same  as  for  power  with  three  exceptions.  F4 
was  added  to  the  levels  of  the  main  effect  for  test  statis- 
tics in  case  one  since  it  will  estimate  mean  square  error 
differently  than  F3.  Zero  was  added  to  the  levels  of  the 
main  effect  for  non-centralities  to  investigate  the  ability 
of  the  test  statistic  to  estimate  o2  when  no  interaction  is 
present. 
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The  authors  of  this  paper  subscribe  to  the  philosophy 
that  when  it  is  not  desirable  or  possible  to  control  main 
effects  in  an  experiment  it  is  proper  to  test  for  signi- 
ficance among  the  levels  of  main  effects  in  the  presence 
of  interaction.  This  also  applies  to  the  testing  of  low 
ordered  interactions  in  the  presence  of  significant  higher 
ordered  interactions.  The  analyst  must  realize,  however, 
that  the  main  effects  and  lov  ordered  interactions  have  been 
averaged  over  all  other  factors  in  the  model  and  any  inter- 
pretation of  significance  must  be  viewed  in  this  light. 

The  analysis  of  the  power  and  mean  square  error  data 
will  be  discussed  a case  at  a time  instead  of  discussing 
power  completely  and  then  mean  square  error. 

9.  RESULTS  AND  DISCUSSION  OF  CASE  ONE  DATA.  Table  4 
is  the  analysis  of  variance  table  for  case  one  power  data 
and  Table  5 is  the  table  for  case  one  mean  square  error 
data.  Significance  was  found  for  almost  every  term. 

The  first  thing  to  be  considered  is  alpha.  Figure  1 
contains  graphs  of  power  and  mean  square  error  for  the  F 
by  o interaction. 

The  graph  of  power  in  Figure  1 indicates  that  the  power 
is  better  using  a larger  alpha  which  is  not  surprising, 
but  the  graph  of  mean  square  error  shows  that  a better  es- 
timate of  mean  square  error  is  obtained  using  a - 0.15 
since  the  line  for  a ■ 0.15  is  closer  to  one  than  that  for 
a ■ 0.05.  Table  5 shows  significance  for  main  effect  a 
which  indicates  that  using  a « 0.15  for  case  one  data  gives 
a better  estimate  of  mean  square  error. 

Now  consider  Figure  2 which  contains  graphs  for  the 
power  and  mean  square  error  of  the  F by  X by  a ■ 0.15 
interaction  term. 

There  is  no  significant  difference  between  the  pi.wer 
curves  of  FI,  F3,  F4,  and  F5  so  power  offers  no  help  as  to 
which  test  statistic  is  the  best  other  than  that  the  power 
of  F2  is  lacking.  The  graph  of  mean  square  error  in  Figure 
2 shows  that  F2  also  lacks  in  ability  to  estimate  mean 
square  error.  There  is  no  practical  difference  between  the 
points  of  FI,  F3,  F4,  and  F5  for  mean  square  error  at  X ■ 0, 
1.5,  2.5.  At  X ■ 3. 5,F3  is  significantly  higher  than  the 
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n 

2 

0.0030 

23.5678 

F 

3 

1.3084 

9997.0462 

nF 

6 

0.0009 

7.3491 

0 

1 

1.0141 

7748.2736 

no 

2 

0.0026 

20.5578 

Fa 

3 

0.0001 

0.8867* 

nFa 

6 

0.0002 

1.7236* 

X 

3 

3.0563 

23351.5011 

nX 

6 

0.0010 

8.1844 

FX 

9 

0.2475 

1891.2634 

nFX 

18 

0.0004 

3.5115 

aX 

3 

0.0097 

74.6679 

naX 

6 

0.0002 

2.1278* 

FaX 

9 

0.0019 

15.0068 

nFaX 

18 

0.0001 

1.1544* 

ERROR 

96 

0.0001 

* Indicates  that  the  term  was  not  significant  at  the 
.05  level.  No  * by  the  F value  indicates  significance  was 
declared  at  the  .05  level. 
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TABLE  S 


ANOVA  Table 

for 

Case  One  Mean  Square  Error  Data 

Source 

DF 

MS  F* 

n 

2 

1.1613 

4570.3185 

F 

4 

1.2276 

4831.2999 

nF 

B 

0.0866 

341.0213 

a 

1 

0.7622 

2999.7058 

na 

2 

0.0750 

299.0747 

Fa 

4 

0.001C 

4.2709 

nFa 

8 

0.0005 

2.1856 

X 

4 

0,8472 

3334.1864 

nX 

8 

0.1453 

571.9092 

FX 

16 

0.3739 

1471 .8028 

nFX 

32 

0.0295 

116.1555 

aX 

4 

0.0517 

203.4774 

naX 

8 

0.0077 

30.3502 

FaX 

16 

0.0026 

10.4372 

nFaX 

32 

0.0005 

2.0556 

ERROR 

ISO 

0.0002 

* All  tests  are 

significant  at  the 

.05  level. 
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other  three  and  at  X ■ 4.5,  F5  separates  from  FI  and  F4. 
At  X ■ 4.5  FI  and  F4  underestimate  error  while  F3  over- 
estimates error  and  F5  estimates  error  exactly. 
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The  problem  with  F2  is  that  it  will  find  significance 
if  the  smallest  sum  of  squares  is  sufficiently  small  with- 
out regard  to  the  size  of  the  largest  sum  of  squares.  Even 
if  the  largest  sum  of  squares  is  large  it  will  not  be  de- 
clared significant  unless  the  smallest  sum  of  squares  is 
sufficiently  small.  Thus,  F2  has  poor  power  and  greatly 
overestimates  mean  square  error. 

At  X ■ 4.S,  F3  estimates  a2  to  be  1.023.  This  is  sig- 
nificantly different,  using  Scheffers  test  at  a ■ 0.05, 
compared  to  the  F5  estimate  of  1.000.  As  X gets  large,  F3 
tends  to  overestimate  o2.  This  is  due  to  the  presence  of 
the  non-central  chi-square  in  the  denominator  of  F3. 

FI  and  F4  have  the  same  numerator 


i«n-r+l  r 

which  leads  to  their  underestimation  of  o2  at  X ■ 4.5.  The 
test  for  mean  square  error  in  case  one  only  goes  as  far  as 
stage  three.  Any  single  degree  of  freedom  sum  of  squares 
declared  significant  at  stage  one  will  remain  in  the  numera- 
tor for  the  stage  two  test.  One  large  single  degree  of 
freedom  if  interaction  sum  of  squares  could  easily  cause  a 
type  one  error  at  stage  two  because  of  the  inflated  numera- 
tor of  the  test  statistic.  This  would  lead  to  an  underes- 
timation of  a2. 

To  further  investigate  FI  and  F4  consider  the  graph  of 
nby  F by  a ■ 0.15  interaction  on  mean  square  error  which 
is  shown  in  Figure  3. 

The  points  of  FI  and  F4  for  n ■ 30  are  lower  than  one. 
As  the  number  of  individual  sums  of  squares  gets  larger  the 
probability  of  a large  central  chi-square  being  present 
increases.  The  numerators  of  FI  and  F4  will  be  inflated 
at  stage  two  with  one  significant  individual  sum  of  squares 
and  a large  central  chi-square  present.  Thus  a type  one 
error  at  stage  two  and  possibly  at  stage  three  could  occur. 
This  would  keep  large  central  chi-squares  from  being  pooled 
into  error  and  would  cause  an  underestimate  of  a2. 
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10.  RESULTS  AND  DISCUSSION  OP  CASE  TWO  DATA.  Tables 
6 and  7 contain  the  analysis  of  variance  tables  for  case 
two  power  and  mean  square  error  data  respectively.  Signi- 
ficance was  found  for  every  term  in  both  tables. 

To  find  the  better  a for  case  two  consider  Figure  4 
which  is  the  F by  a interaction  on  power  and  F by  a by 
stage  one  power  test  of  interaction  on  mean  square  error. 

As  in  case  one  a - 0.1S  estimates  mean  square  error 
better  than  a ■ 0.0S  but  Figure  4 shows  that  the  a ■ 0.1S 
curve  isn't  as  close  to  one  as  it  was  in  case  one  data. 

This  suggests  that  when  two  individual  sum  of  squares 
associated  with  interaction  are  present,  using  a higher  a 
will  better  estimate  o2.  Figure  4 also  shows  that  F2  has 
poor  power  and  greatly  overestimates  mean  square  error. 

For  these  reasons  F2  will  be  dropped  from  any  further  dis- 
cussion. 

Figure  4 also  shows  that  FI  and  F4  have  the  best  power 
of  the  five  test  statistics.  This  is  further  illustrated 
by  Figure  5,  a graph  of  F by  X at  o ■ 0.15  interaction  on 
power. 

The  power  of  FI,  F3,  F4,  and  F5  are  very  close  when 
pairs  of  X are  equal,  but  when  the  pairs  of  X become  un- 
equal the  pattern  changes.  As  the  difference  between  the 
non-centralities  gets  larger  the  difference  in  power  between 
FI  and  F4  compared  to  F5  and  F3  also  spreads.  The  reason 
for  this  becomes  obvious  after  seeing  Figures  6 and  7. 

Figure  6,  which  is  F by  X by  a ■ 0.15  by  stage  one  on 
power,  shows  no  practical  difference  in  power  between  FI, 

F3,  F4,  and  F5,  but  Figure  7,  which  is  F by  X by  a ■ 0.15 
by  stage  two  on  power,  shows  wide  differences  in  power. 

The  differences  in  Figure  5 originate  in  Figure  7 since 
Figures  6 and  7 make  up  Figure  5.  Figure  7 is  power  at 
stage  two  or  rejecting  H : Xj_  ■ X?  ■ . . . ■ Xn-i  ■ 0 when 
it  is  false.  The  real  difference  in  power  between  F5  com- 
pared with  FI  and  F4  begins  as  the  pairs  of  non-centralities 
start  to  spread.  FI  and  F4  have  better  power  because  the 
significant  sum  of  squares  at  stage  one  is  still  in  the 
numerator  and  when  it  combines  with  the  smaller  non-centra- 
lity significance  is  still  found.  At  the  same  time  F5  and 
F3  are  testing  the  smaller  non-centrality  alone  and  not 
finding  it  significant  as  often  as  FI  and  F4.  As  the  smaller 
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Source 


2.0076 

6662.1818 

4.7802 

15862.9815 

0.1806 

599.S040 

9. 3725 

31102.2565 

0.0699 

231.9816 

0.0824 

273.6274 

0.0035 

11.7148 

2.8400 

9424.7329 

0.1017 

337.8058 

0.1657 

550.1733 

0.0078 

25. 8975 

0.0291 

96.6522 

0.0093 

31.1496 

0.0026 

8.6435 

0.0007 

2.5451 

0.6061 

21922.1602 

0.2618 

868.7970 

0.3465 

1149.9805 

0.0427 

141.7871 

0.0652 

216.6943 

0.0072 

23.9867 

0.0349 

116.1258 

0.0027 

8.9653 

0.6665 

2211.8393 

0.0260 

86.2968 

0.0540 

179.3832 

0.0026 

8.8321 

0.0042 

14.1705 

0.0021 

7.1075 

0.0024 

8.2095 

0.0004 

1.5963 

0.0003 

t r represents  stage  of  power  test 
* Each  term  is  significant  at  the  .05  level. 
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TABLE  7 


ANOVA  Table  for  Case  Two  Mean 
Square  Error  Data 


n 

2 

37.5888 

62218.3820 

F 

4 

8. 5351 

14127.7110 

nF 

8 

0.2126 

351.9482 

0 

1 

6.5986 

10922.3722 

na 

2 , 

0.8028 

1328.8303 

Fa 

4 

0.0343 

56.9299 

nFa 

8 

0.0106 

17.6582 

X 

9 

6.7756 

11215.3054 

nX 

18 

2.3379 

3869.8933 

FX 

36 

0.3266 

540.7006 

nFX 

72 

0.0078 

13.0703 

aX 

9 

0.3890 

643.9728 

naX 

18 

0.0964 

159.6656 

FaX 

36 

0.0086 

14.3329 

nFaX 

72 

0.0036 

3.9810 

r 

1 

126.5534 

209475.7356 

nr 

7 

30.6301 

50700.0991 

Fr 

4 

0.5413 

896.1425 

nFr 

8 

0.0734 

121.5168 

ar 

1 

5.2837 

8745.8149 

Far 

4 

0.0951 

157.5172 

nFar 

8 

0.0332 

55.0646 

Xf 

9 

1.3656 

2260.4472 

nXr 

18 

1.0383 

1718.7646 

FXr 

36 

0.1123 

186.0435 

nFXr 

72 

0.0308 

51.0854 

aXr 

9 

0.1382 

228.7546 

naXr 

18 

0.0425 

70.5036 

FaXr 

36 

0.0046 

7.7365 

nFaXr 

72 

0.0027 

4.6175 

ERROR 

600 

0.0006 

t r represents  stage 

* Each  tern  is  significant  at  the  .05 

stage. 
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non-centralitv  gets  larger  the  power  of  F3  and  FS  also 
increases.  This  property  of  FI  and  F4  builds  their  power 
but  nay  not  help  their  ability  to  estimate  mean  square  error. 
Figure  8 is  a graph  of  the  F by  X by  a ■ 0.15  by  stage  one 
power  test  interaction  on  mean  square  error. 

The  only  three  places  that  mean  square  error  of  FI  and 
F4  are  significantly  closer  to  one  than  the  mean  square 
error  of  FS  are  where  the  non-centralities  are  (1.5,  4.5), 
(2.5,  4.5),  and  (4.S,  4.5).  This  is  due  to  the  numerators 
of  FI  and  F4  being  inflated  with  4.5  while  F5  is  testing 
1.5,  2.5,  and  4.5  alone.  This  may  be  fine  for  a test  using 
a ■ 0.15,  but  if  a * 0.25  were  being  used,  the  structure  of 
FI  and  F4  could  cause  them  to  seriously  underestimate  o«, 
whereas  FS  would  not  have  an  inflated  numerator  nor  inflated 
denominator  as  F3.  This  is  what  happened  when  testing  data 
with  one  non- centrality  of  4.5  present  in  case  one  as  illus- 
trated in  Figure  2.  Figure  4 contains  the  points  in  Figure 
8 averaged  over  non- centrality.  From  Figure  4 at  a ■ 0.15 
the  average  mean  square  error  values  are  1.401  for  F5,  1.391 
for  F4,  and  1.396  for  FI.  These  differences  can  be  attri- 
buted to  the  differences  observed  in  Figure  8 at  points 
where  the  added  non-centralities  were  (1.5,  4.5),  (2.5,  4.5), 
and  (4.5,  4.5).  The  differences  in  the  ability  of  FI,  F4, 
and  F5  to  estimate  error  variance  averaged  over  everything 
except  a and  stage  one  power  test  are  of  no  practical  impor- 
tance. 

Figure  9 is  analogous  to  Figure  3 in  case  one.  It  is 
the  n by  F by  a ■ 0.15  by  stage  one  power  test  interaction 
for  mean  square  error. 

At  n ■ 10  the  value  of  F5  is  significantly  closer  to  one 
than  FI  and  F4.  But  as  the  sample  size  increases  to  n ■ 20 
and  n ■ 30,  FI  and  F4  are  significantly  closer  to  one  than 
F5.  This  is  because  a large  central  chi-square  is  more  likely 
to  be  present  as  the  sample  size  increases.  And  the  inflated 
numerators  of  FI  and  F4  tend  to  declare  a portion  of  the  large 
central  chi-squares  significant  whereas  F5  does  not.  If  a 
large  a were  being  used,  FI  and  F4  may  underestimate  error 
whereas  F5  may  avoid  this  problem  because  of  its  structure. 

11.  ANALYSIS  OF  DATA  IN  TABLE  1 USING  SEQUENTIAL 
PROCEDURE".  Table  8 is  an  analysis  of  variance  table  of  the 
data  contained  in  Table  1. 


MEAN  SQUARE  ERROR 


TABLE  8 


ANOVA  Table  for  Data  in  Table  1 


Source 

DF 

SS 

MS 

A 

5 

21221.0 

4244.2 

B 

1 

3798. S 

3798.5 

AB 

5 

6893.9 

1378.8 

C 

4 

5310.0 

1327.5 

AC 

20 

4433.0 

221.7 

BC 

4 

291.8 

73.0 

ABC 

20 

2784,-2 

1 / | 

139.2 

ERROR 

0 

0.0 

0.0 

TOTAL 

59 

44732.4 

The  AC  and  ABC  interaction  terms  were  partitioned  into 
single  degrees  of  freedom  sums  of  squares  and  the  sequen- 
tial  procedure  using  FI,  F3,  F4,  and  FS  was  applied  to  the 
data.  No  indication  of  interaction  was  found  using  a ■ 0.15 
in  either  the  AC  or  ABC  term.  Thus,  both  could  be  pooled 
into  error  giving  an  estimate  of  a2  equal  to  180.43,  however 
interaction  could  be  present  in  most  or  all  of  the  single 
degree  of  freedom  sums  of  squares  of  AC  and  ABC,  which  may 
lead  to  a type  two  error  using  the  sequential  procedure. 

12.  CONCULSIONS.  Based  on  the  results  of  this  paper, 
FI  and  >4  may  be  as  good  a test  statistic  as  FS  if  the  re- 
maining sums  of  squares  are  pooled  into  error  when  signifi- 
cance is  declared  at  stage  three.  F5  estimates  a2  better 
in  case  one  data  then  FI  and  F4  but  in  case  two  data  there 
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is  no  practical  difference.  If,  however,  more  complete 
tables  were  available  (higher  significance  levels  and 
critical  values  for  more  than  three  stages)  the  authors 
would  recommend  F5  as  the  best  of  the  five  test  statistics. 
F5  avoids  the  pitfalls  of  FI  and  F4  which  would  probably 
manifest  themselves  in  much  greater  detail  if  critical 
values  for  more  stages  and  larger  a were  available. 

As  far  as  level  of  significance  is  concerned  0.15  is 
recommended  over  0.05  because  of  the  better  estimate  of 
o2  given.  As  the  number  of  individual  sums  of  squares 
associated  with  interaction  increases  a larger  value  of  a 
will  better  estimate  o2.  This  can  be  seen  by  comparing 
Figure  1 with  Figure  4.  The  results  indicate  that  with  a 
higher  a,  perhaps  0.25,  o2  would  be  estimated  with  less 
bias  than  at  a ■ 0.15. 

These  conclusions  can  only  be  strictly  applied  to  the 
data  analyzed  in  this  paper.  Any  extension  to  three  or 
more  individual  sums  of  squares  containing  interaction 
without  further  research  is  speculation. 
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PLANNING  QUANTAL  RESPONSE  TESTS  FOR  ORDNANCE 
DEVICES!  THE  TWO-POINT  STRATEGY 


R.  E.  Little 
School  of  Engineering 
The  University  of  Michigan 
Dearborn , Michigan 


ABSTRACT.  This  paper  presents  a small  sample  strategy 
that  should  prove  to  be  useful  in  predicting  high  reliability 
(or  high  safety)  for  ordnance  devices.  The  recommended 
"two-point"  strategy  was  developed  by  the  author  for  analogous 
use  in  estimating  fatigue  reliability. 

Briefly,  the  "two-point"  strategy  incorporates  the  well- 
known  up-and-down  (Bruceton)  strategy  in  its  first  stage 
to  generate  two  (nonzero,  nonunity  probability)  points  along 
the  assumed  response  distribution  curve.  Then,  in  its  second 
stage,  the  strategy  allocates  the  remaining  specimens  to  the 
two  corresponding  stimulus  levels  such  that  the  variance  of 
the  point  estimate  pertaining  to  the  reliability  (safety) 
of  interest  is  minimized. 

In  essence,  the  issue  is  to  find  the  specimen  allocations 
which  minimize  the  variance  associated  with  extrapolation 
along  the  fitted  response  distribution  to  a point  remote 
to  the  median.  Optimally,  this  minimization  requires  testing 
certain  specific  proportions  of  the  available  specimens  at 
carefully  selected  specific  stimulus  levels. 

1.  INTRODUCTION.  The  sensitivity  of  explosive  devices 
I to  shock  loading  cannot  be  measured  directly.  Rather,  the 
| explosive  device  must  be  subjected  to  some  arbitrary  shock 
I loading  and  if  the  given  device  explodes  we  know  that  the 
i imposed  shock  loading  exceeded  its  tolerance  to  shock  loading. 
i On  the  other  hand,  if  the  given  device  does  not  explode,  then 

I we  know  that  the  imposed  shock  loading  did  not  exceed  itB 

| tolerance  to  shock  loading.  Conducting  similar  shock  loading 
| tests  at  various  (stimulus)  levels  generates  the  following 
I quantal  response  test  program: 
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Stimulus  Level 
(e.g./  drop  height) 


Number  of  Specimens 
Tested 


Number 
Responding 
(e.g.,  exploding) 


n. 


n. 


n, 


The  problem  of  interest  herein  is  how  to  select  s^  and 

ni  such  that  we  obtain  the  most  precise  estimate  of  the  critical 

stimulus  level  s^  corresponding  to  a very  low  (high)  probability 

of  responding  p,  e.g.,  0.001  or  even  0.00001  (0.999  or  even 
0.99999).  Specifically  we  shall  describe  our  two-point  test 
program  and  estimation  method  [1,2].  The  two-point  strategy 
requires  considerably  fewer  specimens  than  current  techniques 
such  as  the  run  down  method  [3]. 

2.  OPTIMAL  REGRESSION  BACKGROUND.  The  foi living  discussion 
is  intended  to  serve  as  background  material  for  the  subsequent 
summary  of  the  two-point  sti.-tegy. 

2.1.  Simple  Linear  Regression  Example.  Consider  the 
problem  of  most  precise  estimation  of  the  slope  8 for  the 
simple  linear  model 


Y = a + 0x  + e 


(1) 


Assuming  a homoscedastic  variance  a , the  variance  of  8 is 
given  by  the  expression 


Xni^xi  “ x)2 


(2) 


2. 


Elementary  analysis  (or  intuition)  shows  the  takes  on 

its  minimum  value  when:  (a)  only  two  levels  of  x^  are  used 

in  testing,  (b)  these  levels  are  spaced  as  widely  apart  as 
practical,  and  (c)  ntotal/2  specimens  are  tested  at  each  of 

the  two  x.  levels,  where  n.  . is  the  fixed  number  of 

X lOVuI 

specimens  available  for  testing. 
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This  elementary  example  illustrates  the  minimum  variance 
strategy  in  planning  test  programs.  Namely,  select  the 
stimulus  levels  and  allocate  the  test  specimens  such  that  we 
minimize  the  variance  of  some  estimate  of  direct  interest. 

This  minimum  variance  strategy  may  be  applied  to  models  with 
heteroscedastic  variances  and  with  time  and/or  cost  constraints 
12]. 


2.2.  Optimal  Regression  Derivations  for  Linear  Response 
Curved  We  shall  now  discuss  minimum  variance  estimation  o£ 
a point  on  the  linear  response  curve 

y * F'^p)  * a + 0s  (3) 

in  which  s refers  to  the  stimulus  level  and  p ■ F(y)  is 
the  distribution  of  interest  (e.g.,  normal,  logistic,  extreme 
value-smallest) . The  heteroscedastic  binomial  variance 
associated  with  sampling  at  a given  stimulus  level  is 

<^p)  - pq/n  (4) 

in  which  p is  the  true  probability  of  responding,  q « (1  - p) , 
and  n is  the  number  of  specimens  tested  at  the  given  stimulus 
level. 

/v 

We  may  now  use  the  variance  expression  for  p to  obtain 
a variance  expression  for  the  variate  y,  using  the  simple 
2 2 2 

relation  a (ax)  = a a (X)  and  the  assumed  distribution  p * F(y) 
to  obtain  dp/dy,  viz., 


<y> 


(dy/dp)  [pq/n] 


(5) 


Now  by  analogy  with  the  simple  linear  regression  example 
above,  we  conduct  response  tests  at  just  two  stimulus  levels. 
Specifically,  we  test  n^  specimens  at  stimulus  level  s^  and 

n2  specimens  at  stimulus  level  s2,  where  ni  + n2  s ntotal 

specified  prior  to  testing.  We  assume  that  r^  specimens 

respond  during  the  tests  at  and  r2  respond  at  s2.  Hence, 

the  respective  proportions  responding  are  p^  ■ r^/n^  anc* 

P2  ■ r2/n2»  These  values  are  then  used  to  compute  the 

corresponding  y^  values  using  the  relationship  y^  = F_1(p^), 

in  which  p * F(y)  is  the  distribution  function  assumed  for 
the  response  curve. 
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Th0  response  curve  of  interest  appears  in  Figure  1. 
Two  parameter  distributions  plot  as  a straight  line  on 
appropriate  probability  paper,  passing  through  the  two 
points  f(y1#  sx>,  ( y2 , s2)].  Hence, 

a - (y1»2  - y2»1)/(m2  * *i)  (6) 


8 " (Yo  “ - S,) 


Then,  for  any  point  along  the  line,  say  (yA,  s*) , we 
write  0 0 


yi*2  ~ y2si  ii-r. 


y0  ■ ° + 6,o  • * + <»o>  <»> 

and,  since  y^  and  y2  are  independent,  we  see  that 


[ — y0)  ]2a2*  + r 3(y0)  ,2  2« 

<yl)  3(y2)  (y2> 


in  which 

><Yq)  u 

»<?!)  ’ 


(s2  - *0)/<s2  - s1)  and  — — - » - (8l  - s0)/(s2  - s.) 

3(y,)  A u * 1 


»<y«) 


Next,  we  substitute  o?*  . and  c^  . into  (9)  and  introduce 
the  notation  'yl'  W 2 ' 


to  obtain 


(S,  - sn)^  (s.  - s J 


(B2  - BX) 


niWl 


'1  “o' 

+ ---n  w ° ] (12) 
n2W2 


Our  problem  now  is  to  minimize  (12)  by  appropriate  selection 
°“  n^,  n2»  s^,  and  s2* 


4 
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First#  consider  optimum  allocation  of  and  n2  for 

given  values  of  s^  and  s2«  Substitute  n^  ■ - »2 

into  (12)#  and  set  the  derivative  of  (12)  with  respeot  to 
n2  equal  to  sero.  Me  thus  obtain  the  expression 


Jnp*  0 


(8,  - Sn) 


<•2  - »o>‘ 


(ntotal  “ n2*  W1 


(B1  * 


n2  W2 


Equation  (13)  is  satisfied  when 


n,  w2  4 

(“)  - ± <n*>2 
n2  W1 


s,  ■ ■#»  wa  » y2  - yQ 

I.2  -/■>  - 1 <!T>  lyf— y7> 
•l  *0  W1  yl  y0 


where  the  plus  sign  pertains  to  extrapolation  and  the 
minus  sign  pertains  to  interpolation. 

Substituting  (14)  back  into  (12)  gives  (after  some 
algebra) 


(•a  - 


(,1  “ 


* »1> 


(yi  ’ yo> 


ntotal(y2  " yl* 


where  again  the  plus  sign  pertains  to  extrapolation  and 
the  minus  sign  pertains  to  interpolation.  This  variance 
expression  may  now  be  minimized  by  appropriate  selection 
of  and  y2* 

Taking  the  derivatives  of  (15)  with  respect  to  yx 
and  y2  and  equating  these  derivatives  simultaneously  to 
zero  shows  that  the  optimum  values  of  y^  and  y2  are  indepen- 
dent of  the  value  of  yQ  of  specific  interest.  However# 

because  of  the  complex  nature  of  the  w#  p (w,  y)  relationship# 
the  optimum  values  roust  be  determined  numerically#  refer’ to 
Table  1. 
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Distribution 

Optimum  y 

Optimum  p 

yl 

y2 

Pi 

P2 

Normal 

-1.575 

+1.575 

0.058 

0.942 

Logistic 

-2.399 

+2.399 

0.083 

0.917 

Extreme  Value 
- Smallest 

-2.073 

+1.269 

0.118 

0.971 

Table  1. 


Optimum  y and  jj  value*  for  minimum  variance 
eatimation  of  yQ. 


NOTE i Remarkably  the  optimum  value*  also  pertain 
to  minimum  variance  eatimation  of  B,  but  the 
corresponding  optimal  allocation*  differ.  The 
optimum  allocation*  for  minimum  variance  estimation 
of  B satisfy  nj/n2  - (w2/w1)a/2. 


Value  of  yQ 

- 1.575 

- 2.0 


- 3.0 

- 4.0 


Variance  Ratio 
(Normal  Distribution) 

1.000 

1.16 

4.6 

63.5 


Table  2.  Ratio  of  transformed  binomial  variance  o^A  for 

'Yq' 

a11  ntotal  tMtB  conducted  at  stimulus  level  sQ, 
to  the  optimal  regression  variance  o?A  . These 

'y0J 

example  results  pertain  to  the  normal  distribution. 
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2.2.1.  Discussion  off  Results.  It  is  helpful  in  under- 
stand Xng“tH(rTe8urtiTi^iarTzi3~In  Equation  (15)  and  Table  1 
to  plot  w versus  p.  Refer  to  Figure  2.  Here  we  see  that  the 
weight  w approaches  sero  as  p approaches  zero  or  one  (vis., 
as  y approaches  minus  infinity  or  plus  infinity) . This  w, 

P (w,  y)  relationship  indicates  that  i$  we  attempt  to  separate 
s^  and  »2  t0°  widely,  the  variance  of  y^  increases  because 

w in  the  denominator  of  Equation  (15)  approaches  zero.  On  the 
other  hand,  if  we  do  not  separate  s^  and  s2  enough,  then  the 

term  (s2  - s^2  in  the  denominator  is  too  small.  Thus,  there 

are  unique  values  of  s^  and  s2  (independent  of  8q)  which 

minimize  (15)  — not  too  far  apart  and  not  too  close  together. 

It  is  also  helpful  in  understanding  the  optimal  (weighted) 
regression  results  herein  to  compare  the  variances  of 

associated  with  optimal  regression  and  with  direct  testing 
at  the  single  stimulus  level  sQ  corresponding  to  yQ,  refer 

to  Table  2.  Here  we  see  that  optimal  regression  is  much  more 
efficient  than  direct  testing.  The  reason  for  the  increased 
efficiency  is  essentially  that,  as  evident  in  Figure  2,  direct 
testing  at  very  low  or  very  high  p values  is  extremely 
inefficient  because  the  weights  w are  almost  zero  (i.e.,  the 
transformed  binomial  variability  is  so  large) . The  optimal 
regression  strategy,  on  the  other  hand,  allocates  specimens 
to  stimulus  levels  where  the  weights  are  not  only  much  higher 
than  the  weights  associated  with  direct  testing  at  extreme 
values  of  p,  but  it  also  minimizes  the  increase  in  the  variance 
of  yQ  associated  with  extrapolation.  It  iB  clear  from  the 

results  summarized  in  Table  2 that  optimal  regression  is 
remarkably  suited  to  the  problem  of  estimating  stimulus  levels 
corresponding  to  very  high  and  to  very  low  probability  of 
response. 


2.2.2.  Application  to  Ordnance  Problems.  The  optimum 
values  of  p in  Table  1 are  too  close  to  zero and  one  to  have 
direct  application  in  ordnance  problems.  The  difficulty  lies 
in  selecting  s^  and  s2  such  that  we  do  not  obtain  all  response 

or  all  non-responses  at  either  s1  or^s2.  If  either  situation 

occurs,  we  cannot  establish  the  two  y values  required  to 
specify  the  fitted  distribution.  Thus,  to  use  the  optimal 
regression  results  directly,  we  require  very  accurate  initial 
estimates  of  a and  3.  This  requirement  is  of  course  quite 


Figure  2.  Plot  of  w,  p relationships  for  the  normal, 

logistic,  and  extreme  value-smallest  distributions 
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impractical.  Thus,  w«  must  modify  th«  optimal  regression 
strategy  50  maka  aura  that  wa  can  always  aatabliah  tha  two 
raquirad  y valuaa.  Tha  modifiad  procedure  ia  tarmad  tha 
two-point  strategy. 

3.  THE  TWO- POINT  STRATEGY.  There  are  two  varaiona  of 
the  two-point  strategy,  one  for  small  samples,  say  fifty 
specimens  or  lasa,  and  one  for  large  aamples,  say  one 
hundred  or  more  specimens. 

3.1.  Small  Sample  Procedure.  The  small  sample  procedure 
ia  as  followst  (1)  conduct  the  beginning  portion  of  the  test 
program  using  the  up-and-down  strategy  illustrated  in  Figure  3, 
(2)  change  over  to  testing  at  only  two  stimulus  levels  s^  and 

a j as  soon  as  two  finite  values  of  y are  established  by  the 

up-and-down  portion  of  the  test  program,  and  (3a)  allocate  the 
test  specimens  to  s^  and  s2  as  the  test  progresses  using 

Equation  (14)  to  decide  between  testing  at  a1  or  s2»  or  (3b) 

proceed  as  in  (3a)  except  test  at  the  t$o  stimulus  levels 
corresponding  to  the  optimum  values  of  p in  Table  1.  (These 
two  levels  may  be  updated  as  the  test  progresses.  The  iterative 
procedure  may  be  quite  worthwhile  when  s^  and  a2  are  closely 

spaced*®* . ) 

The  up-and-down  portion  of  the  two-point  test  program 
should  generally  be  undertaken  with  the  uniform  spacing 
between  successive  stimulus  levels  chosen  to  be  approximately 
equal  to  the  standard  deviation  of  the  underlying  response 
distribution.  If  the  spacing  is  too  narrow,  the  resulting 
values  of  Sj^  and  s2  in  the  two-point  testing  portion  of  the 

program  will  generally  be  too  close  together  to  permit  precise 
estimation  of  y^.  On  the  other  hand,  if  the  spacing  is  too 

wide,  the  up-and-down  portion  of  the  test  program  tends  to 
be  quite  long,  with  the  successive  test  outcomes  alternating 
back  and  forth  between  response  and  nonresponse.  Thus,  a 
reasonably  accurate  estimate  of  the  standard  deviation  of 
the  assumed  underlying  distribution  is  mandatory,  viz.,  there 


(a)  Ideally  tha  investigator  haB  a computer  program  which 
records  the  given  test  outcome  and  provides  the  stimulus 
level  for  the  next  test.  Otherwise,  the  computations  may 
take  place  at  convenient  intervals  as  the  test  program 
progresses. 
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X denotes  Response , 


0 denotes  Nonresponse 


Stimulus 

Level 


2.0 

1.7 

1.4 

1.1 

0.8 


Figure  3. 


Test  Number 
123456789  10 


XX  X 

0 X 0 X 

0 X 


XXXXO 

0000000000000000X0 


Up-and-down  Testing  Two-point  Testing 
(a)  (b) 


Data  Summary: 


■i 

ni 

ri 

Pi 

2.0 

1 

1 

1.000 

1.7 

3 

3 

1.000 

1.4 

9 

6 

0.667 

1.1 

20 

2 

0.100 

The  two-point  test  program  consists  of:  (a)  a 

beginning  ug-and-downA series  of  tests  to  establish 
two  finite  y values  (p  values  not  equal  to  zero  or 
one) , followed  by  (b)  tests  conducted  at  two 
stimulus  levels,  s^  and  $2*  which  specimens  allocated 

to  s^  or  s2  as  the  overall  test  progresses  such 

that  text  Equation  (14)  is  satisfied. 


NOTE:  The  up-and-down  test  strategy  is  as  follows: 

The  outcome  of  any  given  test  determines  the 
stimulus  level  used  in  the  next  test.  For  example, 
the  second  specimen  responded  (denoted  X) , thus 
the  third  specimen  was  tested  at  a lower  stimulus 
level.  On  the  other  hand,  the  third  specimen  did 
not  respond  (denoted  0)  and  therefore  the  fourth 
specimen  was  tested  at  the  next  higher  stimulus 
level.  Uniform  spacing  between  adjacent  stimulus 
levels  is  used  for  convenience,  but  is  not  mandatory. 


must  be  some  preliminary  tasting  or  soros  prior  experience  to 
form  a basis  for  selecting  the  spacing  of  the  stimulus  levels 
used  in  testing.  Generally  an  estimate  of  the  standard 
deviation  o that  is  accurate  within  plus  or  minus  fifty 
percent  is  adequate#  but  it  is  preferable  that  the  spaoing 
d fall  in  the  range  o < d < (3c/2) . The  advantage  of  the 
iterative  procedure  (3b)  increases  as  d is  decreased  below  o. 

Many  readers  will  probably  opt  for  the  simplified  test 
method  and  analysis.  In  this  case  we  merely  ignore  the 
tests  conducted  at  stimulus  levels  other  than  s1  and  s2 

(refer  to  Figure  3)  and  estimate  the  fitted  distribution  by 
drawing  a straight  line  through  the  two  points  [(y1#  s^), 

(y2,  s2)l.  The  variance  of  yQ  is  then  estimated  using 

Equation  (12)  and  reading  w from  Figure  2. 

If  it  does  not  seem  advisable  to  ignore  fcests  at  stimulus 
levels  other  than  s^  and  s2#  the  variance  of  yQ  may  be 

estimated  using  the  general  expression 


-u  i i Eniwi(s1  - sw) 

The  w^  values  in  (16)  may  be  approximated  either  by  empirical 
weights  (i.e.,  based  on  the  observed  y^  values)#  or  fitted 
weights  (e.g.#  based  on  maximum  likelihood  analyses  [2]). 

3.1.1.  Numerical  Example  (Simplified  Analysis).  Given 


the  quantal  response  data  in  Figure  3 (ignoring  the  tests 
at  stimulus  levels  other  than  1.4  and  1.1) « viz.# 


Solution.  First,  we  shall  check  the  allocation  of  n, 

A A A 

and  n2*  relative  to  the  final  values  of  p.  For  » 0.100, 

y,  from  normal  tables  equals  - 1.28;  and  for  p2  - 0.667, 

y2  equals  + 0.43.  Moreover , ^f or  pg  ■ 0.001,  yQ  ■ - 3.09. 

The  corresponding  values  of  w are  0.34  and  0.60  respectively. 
Thus,  using  (14) 


1 

0.60  .2 
0.34  ' 


, + 0.43  - ( - 3.09)  , 
( -"1.28  - ( 3.09)  1 


2.6 


whereas  the  actual  value  is  20/9  ■ 2.2.  This  discrepancy 
means  that  if  further  tests  were  conducted,  the  first  few 
additional  tests  should  be  conducted  at  s.  » 1.1  — unless 

A A 

of  course  the  p values  change  markedly  as  the  data  accumulate. 


The  fitted  response  distribution  passes  through  the 
points  [(1.1,  - 1.28),  (1.4,  + 0.43)],  giving  the  response 
expression 

y - - 7.55  + 5.07s 


Hence,  yQ 
0.78.  In 


2 

(y0) 


M 


■ - 3.09  (p0  ■ 0.001)  corresponds  to  Sg  equal 
turn,  using  (12) 

1 r (1.4  - 0.78) 2 . (1.1  - 0.78) 2 , 

(1.4  - i.u*  1 20  x 0,34  " 9 x 0.60  J 


Thus 


The  corresponding  lower  95%  asymptotic  confidence  band 
appears  in  Figure  4.  Note  that  we  can  be  approximately 
95%  confident  that  99.9%  of  all  specimens  will  survive  a 
stimulus  level  of  0.22. 

3.2.  Large  Sample  Procedure.  The  large  sample  proce- 
dure is  based  on  information  obtained  by  response  tests 
conducted  using  the  previous  small  sample  procedure.  Namely, 
approximately  fifty  specimens  are  tested  using  the  small 
sample  procedure  to  estimate  s£  and  s£  corresponding  to  the 

optimum  p values  in  Table  1.  Then,  given  this  information, 
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.■alas  Level 


f'S'HSM.ta  .««« 


I the  remaining  specimens  are  tests  at  sj  or  at  s*2  using  (14) 

j for  appropriate  allocation;  or  else  each  successive  specimen 

| may  be  tested  at  that  stimulus  level  which  minimizes  (16) 

S as  the  data  accumulate.  The  latter  iterative  procedure  is 

[ enhanced  by  a digital  computer  program  compiled  and  placed 

| in  a file  ready  for  execution  by  remote  terminal. 

I 4.  SUMMARY.  The  procedure  is  straightforward;  (a) 

select  the  appropriate  values  of  the  stimulus  level,  and 
I (b)  allocate  the  tests  at  these  stimulus  levels  such  that 
! the  variance  of  the  desired  point  estimate  is  minimized. 

[ Usually  the  variance  of  the  desired  point  estimate  may  be 
reduced  markedly  merely  by  considering  a few  alternative 
| stimulus  levels  before  testing  (using  Figure  2 and  Equation 
16) . But  the  variance  of  the  point  estimate  may  be  reduced 
? even  further  by  adopting  certain  minimum  variance  strategies. 
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TECHNIQUES  FOR  STATISTICALLY  DETERMINING  FLIGHT 
SUITABILITY  OF  AN  ARTILLERY  PROJECTILE 


Ronald  Corn 
Gertrude  Weintraub 
Ammunition  Development  and 
Engineering  Directorate 
Picatinny  Arsenal 
Dover,  New  Jersey 


ABSTRACT.  The  M483  155mm  Projectile  being  tested  at 
Nicolet,  Canada,  to  evaluate  aeroballistic  performance  at 
high  air  density  exhibited  flight  instability.  The  authors 
were  responsible  for  determining  cause  of  problem,  correcting 
the  problem  and  developing  the  statistical  technique  necessary 
for  predicting  success.  The  projectile  design  modifications 
evolved  successfully  passed  retesting  at  Nicolet  and  the  pro- 
jectile has  been  released  for  production.  The  induced  yaw 
technique  for  disturbing  projectiles  as  they  exit  the  gun  tube, 
developed  during  this  program,  is  currently  being  used  on  other 
developmental  projectiles  and  will  be  used  to  evaluate  aero- 
dynamic stability  of  all  future  Howitzer  type  projectiles. 

The  statistical  techniques  used  to  predict  success  which 
also  permitted  a minimal  expenditure  of  projectiles  were: 

a.  A Weibull  mathematical  model  waB  selected  and  imple- 
mented to  predict  point  estimates  and  confidence  level  estimates 
of  reliability  and  percentage  points  based  upon  the  maximum 
likelihood  estimates  of  the  parameters  of  a Weibull  population. 
This  model  afforded  excellent  theoretical  descriptive  character- 
istics of  the  density  and  probability  distributions  of  the 
empirical  test  data  which  were  symmetrical  and  asymmetrical 
in  form. 


b.  Automated  computer  programs  especially  adapted  to 
the  Weibull  model  were  employed  to  derive  density  and  proba- 
bility distribution  curves. 

c.  Probability  plotting  methods  were  implemented  to 
describe  the  adequacy  of  the  theoretical  distributions  to  the 
empirical  test  data. 

1.  INTRODUCTION . The  M483  projectile  development  which 
was  completed  in  1971  provided  an  important  new  155mm  capa- 
bility to  the  US  Army.  Figure  1 depicts  an  M483  projectile 
alongside  of  the  standard  155mm  M107  projectile.  Because  of 
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the  obvious  increase  in  size  and  cargo  volume,  over  50%  of 
the  standard,  the  M483  configuration  is  being  utilized  for  a 
variety  of  projectiles  whose  mission  is  to  deliver  cargo  on 
to  a target  area  (e.g.  chemical,  smoke,  illuminating  and  sub- 
munition). 

. To  accommodate  the  increased  cargo,  the  M483  projectile  is 
over  6 calibers  in  length  and  utilizes  an  aluminum  ogive  and 
base  and  fiberglass  wrapped  body  to  minimize  weight  and  distri- 
bute it  properly  for  aerodynamic  considerations.  Because  of 
its  unique  shape,  comparatively  little  knowledge  of  its  aero- 
dynamic characteristics  was  available  prior  to  1974  when  sur- 
prisingly poor  performance  was  exhibited  in  cold  weather  tests. 

In  1974  a cold  weather  test  program  was  conducted  at 
Nicolet,  Canada,  located  between  Montreal  and  Quebec  along 
the  Saint  Lawrence  River  (Figure  2).  Nicolet  provides  an 
existing  Canadian  test  facility  which  permits  projectile 
firings  at  near  Arctic  conditions  to  evaluate  aeroballistic 
performance  at  high  air  density  (in  excess  of  110%  of  standard), 
which  tends  to  amplify  aerodynamic  instabilities. 

On  14  Feb  74,  20  each  M483  projectiles  were  fired  with 
a standard  US  Propellant  charge  whose  weight  was  adjusted  to 
obtain  a velocity  of  Mach  0,93.  At  these  Arctic  conditions 
this  Mach  number  was  predicted  to  be  the  most  severe  aero- 
dynamically.  The  impact  point  of  13  of  those  projectiles 
which  exhibited  normal  flight  performance  is  shown  in  Figure 
3.  These  projectiles  impacted  on  expected  ranges  of  approx- 
imately 6300  meters.  Seven  of  the  twenty  projectiles  impacted 
between  2000  and  3300  meters  short  of  the  impact  area  as  shown 
in  Figure  4. 

Production  of  the  M483  was  suspended  as  a result  of  the 
incident  at  Nicolet  and  an  intensive  program  initiated  to 
determine  the  cause  of  the  erratic  performance  at  Nicolet. 
Initially  a fault  tree  was  configured  (Figure  5)  and  an  in- 
vestigative program  was  developed  based  upon  fault  tree 
elements . 

To  determine  whether  the  cause  of  the  problem  was  routed 
in  interior  or  exterior  ballistics,  it  was  necessary  to  con- 
duct a highly  instrumented  series  of  firings  which  for  the 
first  time,  would  obtain  initial  yaw  characteristics  of  a 
statistical  sample  of  in-flight  projectiles,  as  well  as  pro- 
jectile range  information  for  those  same  projectiles.  Figure 
6 3hows  the  test  site  at  Yuma  Proving  Ground.  Cameras  and 
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yaw  cards  were  used  to  independently  m*a.  re  launch  angles 
of  the  projectile  while  radar  and  staru&rd  triangulation 
techniques  were  used  to  determine  flight  characteristics  and 
dowm  range  impact  points.  Launch  velocities  were  adjusted 
from  standard  US  velocities  to  duplicate  the  critical  mach 
number  of  the  Nicolet  tests  by  modification  of  propelling 
charges. 

The  results  of  the  initial  tests  showed  that  the  M483 
problem  was  primarily  an  exterior  ballistic  problem  and  that‘ 
in  fact,  the  aaroballistic  characteristics  of  the  projectile 
were  unsatisfactory.  Ms.  Weintraub's  application  of  statis- 
tical techniques  proved  invaluable  for  predicting  performance 
and  follows  in  detail. 

2.  STATISTICAL  TECHNIQUES  USED  IN  A FLIGHT  SUITABILITY 
INVESTIGATION.  At  the  outset,  !!  want  to  take" this  opportunity 
to  express  my  gratitude  to  Mr.  Corn  and  his  associates  in  this 
stability  investigation.  They  were  open-minded  and  willing  to 
draw  upon  statistical  disciplines  to  assist  them  in  resolving 
an  engineering  problem.  The  result  of  the  cohesive  union  of 
engineering  and  statistics  proved  successful. 

A complex  problem  was  solved  when  a probabilistic  ap- 
proach was  applied  to  analyze  real  world  test  data.  Professor 
John  Tukey  of  Princeton  would  probably  refer  to  the  statistician's 
efforts  in  our  data  analysis  as  exploratory  and  probabilistic 
and  the  end  result  as  confirmatory.  Our  greatest  gains  in 
analyzing  empirical  data  came  from  surprises,  which  I will  ex- 
plain a little  later. 

In  this  case,  the  engineering  community  succeeded  in 
ferreting  out  the  causes  for  short  rounds  (defined  as  thoBe 
which  fail  to  fly  to  full  range)  and  redesigned  the  projectile 
to  eliminate  the  occurance  of  short  rounds. 

As  statistician,  I entered  the  picture  after  the  following 
events  had  occurred: 

1.  On  10  Feb  74,  seven  out  of  twenty  standard  M483  pro- 
jectiles fired  at  critical  Mach  number  (0.93)  from  the  109A1 
Howitzer  flew  approximately  half  range. 

2.  The  engineering  community  undertook  an  '.nvestigation 
by  designing  a test  program  to  determine  the  cause  of  these 
short  rounds.  The  program  wasacomplex  and  ambitious  one  and 
sought  to  determine  whether  the  problem  was  either  interior  or 
exterior  ballistic  related. 
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3.  Aerodynamic  knowledge  at  the  start  of  the  investigation 
supported  the  belief  that  the  M483  was  stable  up  to  a first 
maximum  yaw  angle  of  8°. 

The  first  test  conducted  at  Yuma  Proving  Grounds  was  with 
the  standard  M483  fired  at  critical  mach  number  in  order  to 
correlate  first  maximum  yaw  angle  with  range.  The  yaw  angles 
were  obtained  with  yaw  cards  and  cameras  as  back  up,  the  test 
set  up  is  shown  in  Figure  6.  The  yaw  oards  were  set  approxim- 
ately 100  feet  forward  of  the  gun. 

Figure  7 is  a plot  of  the  first  maximum  yaw  angle  vs. 
range  and  the  first  surprise  of  this  test  program  was  that  the 
critical  yaw  angle  was  5-6°  and  not  8°  as  previously  predicted. 
Critical  yaw  angle  is  defined  as  the  angle  above  which  the 
projectile  becomes  aerodynamioally  unstable  and  does  not  fly 
full  range. 

The  yaw  angles  generated  from  20  tests  conducted  with  the 
standard  M483  projectile  (varying  its  internal  cargo,  tubes  and 
muzzle  brakes)  were  presented  for  analysis.  As  had  been  done 
on  other  problems,  a probabilistic  design  approach  was  used. 

Yaw  angle  was  considered  the  continuous  random  variable  and  the 
problem  was  to  examine  the  distribution  of  yaw  angle.  I chose 
to  fit  a Weibull  distribution  model  since  it  afforded  me  a use- 
ful mathematical  tool  for  describing  the  probability  distribution 
function  and  the  density  function  of  symmetrical  and  asymetrical 
forms.  Figure  8 shows  a spectrum  of  distributional  forms  which 
can  be  described  by  a Weibull  model  (see  Figure  9 for  the  pdf 
and  density  mathematical  forms  of  the  Weibull  distribution). 

In  terms  of  a statistical  probability  distribution,  the 
distribution  of  yaw  angles  for  the  standard  M483  Projectile 
fired  from  a 50%  worn  tube  at  Yuma  is  seen  on  Figure  10.  It 
was  determined  that  this  condition  tube  produced  the  highest 
first  maximum  yaw  distribution  and  this  tube  was  used  for  most 
of  the  testing. 

Maximum  likelihood  estimates  of  the  parameters  of  a 
Weibull  population  were  determined  based  upon  the  iteration 
procedures  for  joint  maximum  likelihood  estimation  of  the  3 
parameters  of  the  Weibull  population  described  by  Harter  and 
Moore  in  their  notes  contained  in  Technometrics,  Volume  7, 

No.  4,  November  1965.  The  asymptotic  variances  and  covariances 
of  maximum-likelihooo  estimators  were  then  employed  in  deriving 
confidence  interval  estimates  for  probabilities  based  upon  the 
MLE  estimates.  The  latter  confidence  interval  estimates  were 
derived  with  the  assistance  of  Dr.  Einbinder  and  members  of  the 
Computer  Programming  Facility  at  Picatinny  Arsenal. 
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Based  upon  the  maximum  likelihood  estimates  of  the  3 i 

Weibull  parameters,  one  oould  expect  33%  of  the  standard  M483  1 

Projeotiies  fired  from  the  50%  worn  tube  to  exceed  5°.  And,  in  1 

fact,  at  Nicolet,  Canada,  7 out  of  20  (3S%)  fell  short.  This  | 

gave  further  credence  to  the  low  oritical  maximum  yaw  angle  j 

premise . 

The  fitted  yaw  distribution  function  also  indicated  that 
for  the  standard  M483  to  fly  full  range,  its  oritical  first 
maximum  yaw  angle  must  be  greater  than  13°.  At  this  critical 
yaw  angle  one  can  expect  no  more  than  one  short  range  projectile  I 

in  a million  rounds.  i 

, t 

i 

Thereafter,  the  investigative  test  program  was  directed  | 

to  assessing  the  effects  of  system  parameter  changes  on  the 
yaw  angle  distribution  and  the  design  of  modifications  that 
would  have  high  critical  yaw  angles.  The  system  parameters 
investigated  Included:  new  tubas  and  worn  tubes,  with  and 

without  muzzle  brakes,  and  cargo  variation.  It  appeared  that 
the  greateet  effeot  on  yaw  angle  level  was  the  presenoe  or  j 

absense  of  a muzzle  brake  on  the  end  of  the  gun  tube.  i 

j 

Figure  11  shows  how  absence  of  a muzzle  brake  improves  ] 

the  yaw  angle  probability  distribution  of  the  standard  M483.  j 

Now  only  7 in  10,000  rounds  are  expeoted  to  exceed  the  5° 
critical  yaw  angle  in  lieu  of  33%  with  a muzzle  brake.  This  1 

frequency  was  also  too  high  to  be  acceptable.  | 

The  real  problem  facing  the  engineering  task  team  wan  to 
design  a projectile  modification  whose  critical  angle  exceeded 
13°  , since  as  previously  shown  no  more  than  one  short  range  | 

round  in  a million  would  be  expected  at  this  critical  yaw  angle.  | 

After  many  design  modifications,  and  statistical  analyses  'J 

of  these  changes,  two  modifications  of  the  standard  M483  were  i 

built  and  tested:  Figure  12  describes  the  modifications  made  j 

to  the  standard  M483i  Figure  13  compares  the  yaw  angle  probabil-  1 

ity  distribution  functions  obtained  for  Mods  1 and  2 when  tested 
with  the  50%  worn  tube  with  muzzle  brake.  For  each  Mod,  it  was  3 

found  that  one  in  a million  rounds  would  exceed  8°  first  maximum  J 

yaw  angle. 

Since  the  modifications  were  designed  to  be  more  stable 
than  the  standard  M483,  a technique  had  to  be  devised  for  deter- 
mining how  much  more  stable  they  were  and  also  their  critical  J 

yaw  angle.  j 

1 

Since  it  had  been  determined  that  muzzle  brakes  signifi-  | 

cantly  effected  yaw  angles,  modified  muzzle  brakes,  Figure  13A,  j 
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were  designed  and  tested  as  a means  of  inducing  even  greater 
yaw  angles  to  evaluate  design  modifications.  First  maximum 
yaw  angles  of  as  high  as  20°  were  obtained. 

Figure  14  illustrates,  visually,  by  means  of  a yaw  oard 
comparison,  the  large  angle  from  which  the  modified  rounds 
will  still  damp  and  fly  normal  ranges  as  compared  to  the 
original  M483  projectile,  Figure  15. 

An  interim  Picatinny  Report  dated  March  1975  has  been 
published  covering  this  work.  Figures  16  and  17  show  the 
adequacy  of  the  Weibull  model  in  describing  the  empirior.1 
distribution  characteristics  of  test  data  for  the  standard 
M483  round  and  for  design  modification  2. 

This  probability  plotting  method  was  used  to  assess  the 
goodness  of  fit  of  the  theoretical  Weibull  model  to  the  em- 
pirical test  data. 

Figures  18  and  19  show  the  density  function  for  the 
standard  M483  and  design  modification  2.  Each  of  the  distri- 
butions is  right-skewed,  but  we  can  see  that  modification  2 
shows  a significantly  smaller  dispersion  around  the  mean. 

Summing  up,  therefore,  what  modification  2 accomplished 
is  two- fold: 

1.  It  yielded  a significantly  smaller  dispersion  of 
first  maximum  yaw  angle  around  the  mean,  one  in  a million 
exceeds  8°  vs.  33%  exceeding  5°  for  the  standard  M483, 

2.  It  produced  a more  stable  projectile,  critical  angle 
greater  than  18°  vs.  5-6°  for  the  standard  M483. 

3.  CONCLUSION . A real  world  engineering  problem  was 
resolved  with  the  assistance  of  probability  methods.  Statis- 
tical analyses  were  helped  immeasurebly  by  computer  software 
programs  which  were  available.  These  programs  afforded  rapid 
assessment  of  design  modifications  and  comparisons.  The 
efforts  could  not  possibly  have  been  accomplished  in  as  short 
a time  without  the  computer.  The  computer  program  of  Drs. 
Harter  and  Moore  of  Wright-Patterson  Air  Force  Base  was  used 
extensively  to  derive  the  maximum  likelihood  estimates  of  the 
Weibull  parameters.1  Software  programs  available  at  Picatinny 


1 As  an  aside,  gratitude  is  extended  to  Dr.  Badrig 
Kurkjian  for  Introducing  Picatinny  Arsenal  to  the 
Harter  Moore  program  which  has  proved  to  be  invaluable 
in  helping  to  solve  many  engineering  problems. 


172 


tiwittawE^ 


Arsenal , sptolfleally  In  the  Concepts  and  Effectiveness 
Division,  contributed  greatly  toward  the  successful  evaluation 
of  test  data. 

4.  STATISTICAL  CONTRIBUTION. 

1.  Statistical  probability  techniques  fixed  the 
critical  yaw  angle  for  the  standard  M483  Projectile. 

2.  Statistical  analysis  predicted  the  yaw  angle 
probability  distribution  for  many  modifications  and  for  dif- 
ferent tubes.  These  distributions  provided  the  engineering 
task  team  with  essential  information  for  directing  their 
efforts  toward  projectile  modification. 

3.  a.  For  the  first  time,  probability  design  was 
used  to  prediat  projectile  performance  using  a minimal  number 
of  rounds.  Cost  reduction  and  risk  associated  with  future 
artillery  development  programs  should  follow. 

b.  The  application  of  probability  design  served 
a twofold  purpose : 

<1)  It  predioted  the  probability  of  exceeding 
a given  yaw  for  a specifio  design  M483  Projectile. 

(2)  It  afforded  the  engineering  task  team 
a goal,  in  this  case,  a 13°  critical  yaw  angle}  so  that  their 
efforts  were  directed  toward  achieving  this  goal  in  order  to 
eliminate  short  rounds. 

4.  A Blue  Ribbon  Panel  especially  assigned  to  over- 
view the  stability  investigation  approved  the  efforts  and 
findings  of  the  investigative  team  and  commended  all  members 
of  the  team  for  their  analysis  of  and  correction  to  the  pro- 
jectile flight  problem.  The  panel  further  stated  that  "in 
the  course  of  this  program  much  has  been  learned  that  is  of 


basic  value  in  the  ballistic  design  and  development  of  project- 
iles." Further,  the  panel  recommended  that  the  "team  can  well 
undertake  future  new  and  interesting  designs  of  special  shells" 
and  recommended  that  this  project  be  well  documented  for  future 
guidance. 


CONCLUDING  REMARKS; 

As  a result  of  the  program  and  techniques  just  described,  modifications 
1 and  2 were  extensively  tested  at  Nlcolet  during  the  winter  of  1975.  Both 
modifications  performed  satisfactorily  as  predicted.  Modification  2 was 
selected  since  It  did  not  result  In  Internal  cargo  volume  loss  and  It  was 
recently  released  for  production  as  the  M483A1. 
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APPLICATION  OF  LIFE  TESTING  TECHNIQUES  TO  DETECTION  DATA 

Carl  B.  Bates  and  Jerry  Thomas 
Applications  Group 

Methodology  and  Resources  Directorate 
US  Army  Concepts  Analysis  Agency 
Bethesda,  Maryland 

ABSTRACT.  Life  testing  techniques  for  censored  sample  data  are 
discussed.  Singly  and  progressive  censoring  of  type  I and  type  II  are 
defined.  The  detection  phenomenon  Involving  observers  not  always  detecting 
targets  Is  placed  In  the  framework  of  progressively  censored  sampling. 

Maximum  likelihood  estimates  for  the  parameters  of  the  two-parameter 
Weibull  distribution  are  given,  and  a test  statistic  Is  presented  for 
comparing  two  Weibull  distributions  fitted  to  censored  sample  data. 

Weibull  distributions  of  sample  sizes  500,  250,  and  100  having  0,  10,  and 
20  percent  censored  are  simulated.  The  shape  parameter  Is  varied  over  the 
range  1.0  to  3.5  and  equality  of  pairs  of  the  distributions  Is  tested. 

The  relationships  between  Beta  and  the  Beta  difference  that  is  distinguishable 
are  given  for  each  of  the  three  sample  sizes.  For  the  largest  sample 
size,  at  the  0.5-level  of  significance,  the  Beta  difference  that  Is 
distinguishable  varied  from  0.15  for  small  shape  parameter  values  to 
0.38  for  large  shape  parameter  values.  For  the  100  sample  size  distribu- 
tions, the  Beta  difference  distinguishable  varied  from  0.30  to  0.73. 
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I.  INTRODUCTION 


The  detection.  Identification,  and  localization  of  eneiny  targets 
Is  an  Integral  part  of  many  US  Army  studies.  These  studies  may  be 
classified  Into  either  computer  simulated  experimentation  or  field 
conducted  experimentation.  Field  experimentation  Involving  the  detec* 
tlon  process  Is  usually  performed  to  estimate  or  compare  the  effective* 
ness  of  materiel  or  methods  of  employment.  Often  empirical  data  from 
the  field  experimentation  Is  then  used  as  Input  to  computer  simulation 
models,  or  the  analysis  results  of  the  empirical  data  are  used  to 
provide  the  basis  of  simulating  detection  In  computer  simulation  models. 

Because  of  the  "no  detections"  (observers  not  detecting  exposed 
targets)  which  occur  In  field  experimentation  Involving  detection 
processes,  the  analysis  of  empirical  detection  data  presents  unique 
problems.  In  the  sections  which  follow,  the  analysis  problems  are 
discussed  and  a proposed  analysis  methodology  Is  presented  and  Illustrated. 
II.  PROBLEM  DESCRIPTION  AND  BACKGROUND 
A.  Problem  Description 

A field  experiment  Involving  candidate  land  combat  systems  Is 
designed  and  conducted.  One  of  the  many  measures  of  effectiveness  of 
the  systems  Is  detection  time.  During  the  conduct  of  the  experiment, 
however,  the  systems  do  not  always  detect  exposed  enemy  targets.  There- 
fore, detection  time  data  Is  not  collected  for  all  of  the  planned  trials 
of  the  field  experiment.  Consequently,  the  original  orthogonal  design 
for  the  experiment  Is  nonorthoqonal  with  respect  to  the  response  variable, 


detection  time.  The  objective  of  this  report  Is  to  present  a method  of 
analysis  which  uses  both  the  detection  times  of  detected  targets  end  the 
exposure  times  of  undetected  targets. 

B.  Background 

Land  combat  experimentation  Involving  the  detection  of  targets 
Invariably  results  In  targets  not  being  detected  for  some  of  the  experi- 
mental trials,  e.g.,  Cavlness  et  al.  (1972)  and  McKinney  et  al.  (1971) 
and  (1972).  Treating  the  "no  detect"  trials  as  missing  values  and  apply- 
ing one  of  the  statistical  techniques  for  estimating  missing  values  does 
not  have  appeal  because  It  does  not  utilize  all  the  available  information 
From  the  experimental  data,  namely,  the  duration  of  the  time  that  llne-of- 
slght  existed  between  the  observer  and  the  target.  Ignoring  the  no  detect 
trials  and  analyzing  only  the  data  from  trials  for  which  a detection  did 
occur  does  not  have  appeal  for  the  same  reason.  Moreover,  analyses  based 
on  all  available  experimental  data  addresses  the  unconditional  detection 
probability  of  Interest,  whereas  analyses  based  on  only  trials  for  which 
a detection  did  occur  addresses  the  conditional  probability  of  detection, 
given  a detection  has  occurred. 

A search  for  a proper  method  of  analysis  of  the  detection  times 
which  would  utilize  the  target  exposure  times  of  the  no  detect  trials  led 
to  the  area  of  life  testing.  It  was  concluded  that  the  detection 
phenomenon  when  all  targets  are  not  detected  Is  similar  to  the  censored 
sample  situation  In  life  testing. 


III.  LIFE  TESTING 

| In  life  testing  a number  (N)  of  components  are  tested  and  the  time 

to  a component's  failure  Is  recorded.  If  components  are  withdrawn  from 
the  test  before  failure  (In  our  case  a target  passes  from  an  exposed 
state  to  a concealed  state  without  being  detected)  the  sample  Is  termed 
censored.  Censoring  may  be  of  two  types: 

1.  Type  I - In  which  at  some  predetermined  fixed  time,  say  t., 

testing  Is  terminated,  or  9 

2.  Type  II  -In  which  after  some  predetermined  fixed  number, 

say  n,  of  sample  Items  fall,  testing  Is  terminated. 

With  each  type  of  censoring,  the  collected  data  consists  of  the  n failure 
times  tj,  t2>  ...»  tn>  plus  the  Information  that  the  remaining  (N-n)  Items 
survived  beyond  the  time  of  termination,  t0  for  Type  I and  tR  for  Type  II. 

The  above  described  censoring  Is  termed  singly  censored  samples.  If, 
however,  the  Initial  censoring  results  In  withdrawal  of  only  a portion 
! of  the  surviving  Items,  with  some  remaining  under  test  until  ultimate 

i 

; failure  or  until  a subsequent  stage  of  censoring  Is  performed,  we  have 

i 

progressively  (multiple)  censored  samples.  In  general  then  censoring 
occurring  progressively  In  k stages  at  times  T^;  1*1,2,...,k,  and  at 
each  1th  stage  of  censoring  r^  sample  Items  are  selected  randomly  from 
the  survivals  at  time  T^  and  removed  (that  Is,  censored)  from  further 
observation.  This  Is  analogous  to  our  detection  phenomenon.  We  have  a 
target  coming  from  a concealed  state  to  an  exposed  state  just  as  a test 

Item  starting  under  observation  during  test.  If,  however,  a target 

i 

passes  from  an  exposed  state  back  to  a concealed  state  without  being 
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detected,  It  Is  removed  from  further  observation  at  a time  T^  (equal  to 
the  target's  total  exposure  time).  Further,  In  our  case  each  of  the  k r^ 
equal  one  because  In  general  the  exposure  times  of  any  two  or  more  unde- 
tected targets  are  not  Identical. 

Past  experience  has  shown  a positive  skewness  In  the  empirical  data 
distributions  of  time  variables  associated  with  the  target  detection 
process.  Bates  (1971)  and  McKinney  et  al.  (1971)  and  (1972).  Moreover, 

In  McKinney  et  al.  (1972)  It  was  shown  that  the  two-parameter  Welbull 
distribution  gave  adequate  approximations  to  detection  time  sample  dis- 
tributions. In  the  probability  density  function  (pdf)  of  the  two-param- 
eter Welbull  distribution, 

f(x)  - (B/ae)xS_1  expl-(x/a)6l;  x > 0,  c.  > 0,  B > 0,  (1) 

a Is  the  scale  parameter  and  6 Is  the  shape  parameter. 

The  Welbull  distribution  provides  considerable  flexibility  for 
approximating  a variety  of  distributions.  When  B ■ 1 we  have  the  exponen 
tlal  distribution  and  when  b * 3.5  we  have  a distribution  very  close  to 
the  normal  distribution.  In  FIGURE  1 on  the  next  page,  the  Welbull  pdf 
Is  shown  for  three  different  shape  parameters.  The  middle  curve  is  a 
positively  skewed  distribution  similar  to  that  of  our  target  detection 
times. 
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0 < 1 


6 ■ 1.5  e * 3.5 


FIGURE  1.  Wei  bull  Probability  Density  Function 


The  flexibility  of  the  Welbull  distribution  can  be  further 
Illustrated  In  terms  of  the  cumulative  distribution  function  (cdf). 

In  the  context  of  our  detection  problem,  the  cdf  F(x^)  is  the  probability 
of  detection  by  time  x^.  FIGURE  2a  Is  an  S shaped  cdf  similar  to  that  of 
a normal  distribution.  FIGURE  2b  Illustrates  the  cdf  of  a Welbull 
distribution  having  the  same  shape  parameter  as  the  distribution  In 
FIGURE  2a,  but  a larger  scale  parameter.  FIGURE  2c  has  the  same  scale 
parameter  as  FIGURE  2a,  but  a smaller  shape  parameter. 
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FIGURE  2,  Wei  bull 


Cumulative  Distribution  Function 


pm 


IV.  ESTIMATION 

The  first  step  In  the  analysis  process  Is  the  approximation  of  the 
distribution  of  target  detection  times.  This  Involves  estimating  the 
two  parameters,  a and  s,  of  equation  (1).  Substituting  a and  6 for 
a and  B In  f (*)  gives  the  approximation  distribution,  f(x),  of  target 
detection  times.  The  estimation  technique  which  Is  employed  evolved 
from  life  testing. 

Cohen  (1963)  shows  that  although  Intermediate  steps  In  the  deriva- 
tions differ,  the  maximum  likelihood  estimation  equations  for  Type  I 
and  Type  II  progressively  censored  samples  yield  the  same  end  result. 

The  maximum  likelihood  estimation  equations  for  the  two-parameter  Weibull 
distribution  are  given  In  Cohen  (1965).  The  equations  are  nonlinear  In 
the  parameters  and  must,  therefore,  be  solved  by  Iterative  procedures. 

He  solves  the  expression, 

A * A 

( (E*x1 In  x^/E  x?)-(1/b5]  * ( 1/n) E In  xi  (2) 

A 

for  s.  The  asterisk  denotes  that  the  sumnation  is  over  the  entire  sample 
with  the  r^  observations  censored  at  time  assigned  the  value  x^  * T^. 
Then,  substituting  a obtained  from  equation  (2)  into  the  other  maximum 
likelihood  estimation  equation,  aln  L/3a,  and  solving  for  o he  gets 

S = (’I 

where  In  L is  the  logarithm  of  the  likelihood  function.  Substitution 
of  the  two  obtained  parameter  estimates,  a and  b,  into  equation  (1) 
yields  the  desired  approximation  distribution  f (x ) . 
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The  mean  of  f(x)  is 
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i 

p-. 

I 

i 

r 


£(x) 


5 r(i  ♦ l/d). 


(4) 


and  the  approximate  variance  Is 

V(x)  * (»f/3a)2V(«)  + (af/3g)2V(e)  ♦ 2(3f/3a)(3f/30)Cov(«J).  (b) 


V.  HYPOTHESIS  TEST1N6 

Suppose  that  In  a field  experiment  two  candidate  detection  devices  are 
under  study.  One  of  the  primary  objectives  of  the  experiment  Is  to  com- 
pare the  detection  distributions  of  the  two  devices  and  make  Inferences 
concerning  the  equality  of  the  two  populations.  After  applying  the 
estimation  techniques  In  the  previous  section  to  the  empirical  detection 
data  collected  on  the  performance  of  the  two  devices  to  approximate  the 
distribution  for  each  device,  we  are  now  Interested  In  comparing  these  two 
distributions.  Specifically  the  null  hypothesis. 


is  tested  against  the  two 


sided  alternative  hypothesis, 


I 

i 


i 


] 

j 

1 

i . 


i 
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The  test  statistic  for  testing  the  null  hypothesis  against  the 
alternative  hypothesis  Is  Q,  where 


» A A 

K * ai  • 


o2(a)  o(a,§) 
o(o»b)  o2(e) 


and  where  the  variance-covariance  matrix  Is 


(8) 


o (a)  o(a,§) 

o (&.S)  o2(S)  , 


“V^)  * v(a2) 

Cov(a  ,6  ) + Cov(a  ,6  ) 
1 1 2 2 


Cov(a  ,8  ) + Cov(a  J 
11  2 x 

V(S  ) ♦ V(B  ) 

1 2 


Equation  (8)  Is  a quadratic  form  and  Is  approximately  distributed 
as  a Chi-square  variate  with  two  degrees  of  freedom,  see  for  example. 
Mood  (1950),  Rao  (1952),  or  Wilks  (1962).  That  Is, 


Q - x*(2). 


(10) 


An  Inspection  of  equation  (8)  shows  that  close  agreement  between  the 
two  distributions  yields  a small  statistic,  while  a large  difference 
between  the  two  yields  a large  statistic.  Therefore,  the  critical  region 
of  the  test  Is  the  upper  tall  of  the  x2-d1str1but1on.  Consequently,  to 
test  the  null  hypothesis  of  equation  (6),  compare  Q with  x2(1-Oi2).  If 
Q » xz(l-a,2),  reject  the  null  hypothesis  at  the  a-level  of  significance; 
otherwise  do  not  reject  the  null  hypothesis.  By  rejecting  the  null 
hypothesis,  we  are  saying  that  the  two  detection  distributions  are  not 
equal . 
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VI.  TEST  DISCRIMINATION 
A.  General 

In  the  previous  section  It  Mas  seen  that  the  determination  of  a 
difference  between  distributions  Is  dependent  upon  the  scale  parameter, 
a,  and  the  shape  parameter,  0.  For  this  study  It  was  decided  to  set  o 
equal  to  25  and  concentrate  our  efforts  on  the  shape  parameter,  B.  When 

B * 1,  the  Welbull  distribution  Is  equivalent  to  the  exponential  dlstrl- 

j 

butlon  and  when  0 ■ 3.5,  the  distribution  Is  approximately  normal.  Since 
the  shape  of  the  detection  distribution  Is  expected  to  be  within  this 
i range,  shape  parameter  values  between  1.0  and  3.5  are  studied. 

I B.  Sample  size  of  500 

| Test  performance  In  application  can  be  no  better  than  the 

! 

| asymptotic  power  of  the  test.  Because  no  Information  Is  available  on  the 

| power  of  the  test,  an  Initial  sensitivity  analysis  Is  performed.  Conse- 

quently, large  samples  having  a moderate  amount  of  censoring  are  first 
( studied. 

Welbull  distributions  of  sample  size  500  having  three  different 
percentages  of  censoring  (0,  10,  and  20)  were  generated  by  Monte  Carlo 
i simulation.  The  scale  parameter  was  arbitrarily  fixed  at  <x  ■ 25.  The 

, range  of  the  shape  parameter  values  (1.0  to  3.5)  was  divided  Into  five 

I 

I sub-ranges  of  length  0.5  each.  Within  each  sue  range  0 was  Incremented 

! in  steps  of  0.1  to  give  six  0-values,  e.o.,  (1.0,  1.1,  1.2,  1.3,  1.4,  1.5), 

: (1.5,  1.6,  1.7,  1.8,  1.9,  2.0),  ....  (3.G,  3.1,  3.2,  3.3,  3.4,  3.5).  For 

1 each  of  the  six  6-values,  a Welbull  distribution  was  generated  for  each 

i 

| of  the  three  percentages  censored.  This  gave  eighteen  distributions 

\ 

\ 

i 
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for  each  of  the. five  6-value  sub-ranges  or  a total  of  153  pair-wise 
comparisons.  For  completeness  and  anticipated  follow-on  analyses, 
summary  statistics  are  tabulated  in  APPENDIX  A.  TABLES  A-l  through  A-5 
contain  the  five  sets  of  summary  statistics  of  the  eighteen  distributions. 

Within  each  set  of  eighteen  distributions,  all  possible  (153)  compari- 
sons were  made  between  pairs  of  distributions.  That  1$,  the  null 
hypothesis  of  equality  of  the  two  distributions,  equation  (6),  was  tested. 
This  gave  153  Q-statlstlcs.  The  corresponding  6 differences  (Si-Si ityj; 
1*1,2,...,17;j*2,3,...,18)  were  calculated  and  paired  with  the  Q-statistlcs. 
Within  each  set  of  6 differences  and  Q-statistlcs,  six  different  combina- 
tions existed  between  the  percentages*  censored  In  the  two  distributions 
being  compared-(O.O) , (0,10),  (0,20),  (10,10),  (10,20),  and  (20,20).  The 
distribution  of  the  153  cases  over  the  six  combinations  Is  shown  In 
TABLE  1 below. 


TABLE  1 

CENSORING  DISTRIBUTION 


Combination  Percentage  Censored  Number  of 

Number  (Sample  j , Sample  i ) Samples 


1 

(0,0) 

15 

2 

(0,10)  or  (10,0) 

36 

3 

(0,20)  or  (20,0) 

36 

4 

(10,10) 

15 

5 

(10,20)  or  (20,10) 

36 

6 

(20,20) 

15 
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The  theoretical  relationship  between  the  e differences  and  Q Is 
parabolic.  Therefore,  a quadratic  In  $ differences  was  fitted  for  each 
of  the  six  combinations  In  TABLE  1,  using  8 differences  as  the  Independent 
variable  and  the  Q-statlstlc  as  the  dependent  variable.  Within  each  of 
the  five  e-value  sub-ranges,  the  quadratic  fit  for  each  of  the  six  censor- 
ing combinations  was  evaluated  for  Q - 5.991,  the  x*(?)  critical  value  for 
the  0. 05-level  of  significance.  This  gives  the  difference  between  the 
shape  parameters  of  two  distributions  which  would  be  declared  significant 
at  the  0. 05-level  of  significance.  The  largest  variation  among  each  set 
of  the  six  8 differences  was  0.04.  This  Is  well  within  the  variability 
of  the  generated  data.  The  six  combinations  of  each  0-value  sub-range 
were  then  "pooled"  ai.J  a quadratic  fit  was  made  to  each  of  the  five  sub- 
ranges of  the  153  § differences.  All  fits  were  "good";  the  coefficients 
of  determination  ranged  from  0.90  to  0.97.  Each  of  the  five  sub-range 
quadratic  regression  equations  was  then  evaluated  for  two  levels  of  signi- 
ficance (0.05  and  0.01)  or  Q ■ 5.991  and  Q ■ 9.210.  The  resulting  relation- 
ship between  b and  the  0 differences  detectable  for  the  two  significance 
levels  Is  graphically  Illustrated  In  FIGURE  3 on  the  following  page. 

FIGURE  3 suggests  a strong  linear  relationship  between  6 and  the  0 
difference  that  Is  detectable.  In  fact,  the  ratio  of  the  plotted  0 
differences  over  their  respective  sub-range  mid-points  Is  nearly  constant 
for  each  level  of  significance.  At  the  0.05-level  of  significance,  the 
ratio  Is  approximately  0.12:  at  the  0.01-level  of  significance,  It  Is 
approximately  0.15. 
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1.0  1.8  2.0  0 2.5  3.0  3.5 

FIGURE  3.  TEST  DISCRIMINATION  FOR  N-500 


FIGURES  A-l  through  A-5  of  APPENDIX  A pictorial ly  Illustrate  typical 
distributions,  within  each  of  the  five  sub-ranges,  which  are  statistic- 
ally different  when  equality  is  tested  at  the  0.05-1 evel  of  significance. 
Each  of  the  five  figures  contains  a plot  of  two  distributions,  taken 
from  the  samples  shown  in  TABLES  A-l  through  A-5,  respectively.  The 
distribution  having  the  smaller  shape  parameter  Is  drawn  with  a solid 
line  and  Its  shape  parameter  estimate  is  denoted  by  $i',  the  distribu- 
tion having  the  larger  shape  parameter  Is  shown  with  a dashed  line  and 
Its  shape  parameter  Is  denoted  by  8 . For  example  In  FIGURE  A-1, 
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samples  were  selected  from  a distribution  with  a * 1,1  (with  no  censoring) 

and  with  e - 1.2  (with  10%  censoring);  and  the  two  sample  estimates  of 

the  shape  parameter  are  8)  * 1.062  and  §2  ■ 1.227.  The  Q-statlstlc  for 

testing  the  null  hypothesis  of  equation  (6)  Is  also  given  on  each  figure. 

2 

In  each  case,  the  Q-statlstlc  1$  between  x (0.95,2)  - 5.991  and 
2 

x (0.99,2)  - 9.210.  That  Is,  the  level  of  significance  at  which  the  null 
hypothesis  would  be  rejected  Is  between  0.05  and  0.01.  The  five  figures 
Illustrate  the  test  discrimination  between  distributions  of  different 
shapes  over  a range  of  shape  parameter  values  from  1.0  to  3.5. 

C.  Sample  Size  of  250 

In  practice  large  samples  are  often  not  available.  Therefore, 
test  performance  for  two  smaller  samples  (N  ■ 250  and  N ■ 100)  are  studied. 
The  results  for  N - 250  are  presented  first. 

Welbull  distributions  of  sample  size  250  were  generated.  The 
same  scale  and  shape  parameters  and  the  same  percentages  of  censoring 
were  used  as  for  N s 500.  The  procedure  described  In  Section  A above  was 
repeated  using  N * 250.  The  summary  statistics  are  given  In  TABLES  B-1 
through  B-5  of  APPENDIX  B.  This  time  the  largest  variation  among  each 
set  of  the  six  B differences  was  0.07.  Again,  this  variation  Is  within 
the  variability  of  the  generated  data.  The  Beta  differences  obtained 
from  the  evaluations  of  the  five  quadratic  regression  equations  are 
given  In  TABLE  2 below.  As  before,  there  appears  to  be  a linear  relation- 
ship between  8 and  the  B difference  that  Is  detectable. 


j 
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TABLE  2 


w 

f 


Significance 

Level 

0.05 

0.01 


BETA  DIFFERENCES  FOR  N ■ 250 


1.5-1. 5 

1 ,5-2.0 

2.3-2. 5~ 

2. 5-3.6 

"'3. 0-3.5 

0.20 

0.29 

0.37 

0.45 

0.54 

0.24 

0.36 

0.47 

0.55 

0.67 

i= 

K 

£ 

r 
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The  test  discrimination  for  each  of  the  five  sub-ranges  Is  Illustrated 
In  FIGURES  B-l  through  B-5  In  APPENDIX  B.  The  notation  In  the  figures 

Is  the  same  as  that  described  In  the  previous  section.  The  distribution 

having  the  smaller  shape  parameter  estimate  Is  denoted  by  and  the  larger 
Is  denoted  by  §2.  The  significance  level  of  each  pair  of  Illustrated 
distributions  Is  between  0.05  and  0.01.  The  Q-statlstlc  Is  again  given 
on  each  of  the  five  figures. 

D.  Sample  Size  of  '00 

In  the  examination  of  the  test  performance  for  N ■ 100,  th  sub- 

ranges of  the  shape  parameter  values  had  to  be  reconstructed.  This  was 
because  the  Beta  difference  which  is  distinguishable  Is  larger  than  0.5 
for  shape  parameters  greater  than  1.5.  Therefore,  the  shape  parameter 
range  was  divided  Into  three  sub-ranges  rather  than  the  five  previously 
used.  The  three  sub-ranges  were  1.0-1. 5,  1.5-2. 5,  and  2. 5-3. 5.  Within 
the  first  sub-range,  b was  incremented  In  steps  of  0.1  as  before.  But 
within  the  two  larger  sub-ranges,  b was  incremented  in  steps  of  0.2. 

This  gave  six  B-values  for  each  of  the  three  sub-ranges.  The  summary 
statistics  of  the  three  sets  of  eighteen  distributions  are  given  In 
TABLES  C-1,  C-2,  and  C-3. 
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The  largest  variation  among  each  set  of  the  six  0 differences  was  0.08, 
again  within  the  variability  of  the  data.  The  Beta  differences  from  the 
three  quadratic  regressions  are  given  In  TABLE  3.  Test  discrimination  Is 
plctorlally  Illustrated  In  the  three  figures  of  Appendix  C. 

TABLE  3 

BETA  DIFFERENCES  FOR  N ■ 100 


Significance 

Level 


The  test  discrimination  for  all  three  sample  sizes  Is  shown  In 
FIGURE  4.  All  three  sample  sizes  exhibit  a linear  relationship  between  0 
and  the  0 difference  that  Is  detectable.  As  expected,  the  0 difference 
that  Is  detectable  Is  smaller  for  large  sample  sizes  than  the  0 difference 
that  Is  detectable  for  small  sample  sizes.  The  dependence  of  the  0 
difference  that  Is  detectable  upon  0 is  greater  for  small  sample  sizes 
than  it  Is  for  large  sample  sizes.  The  trend  of  the  lines  for  N » 100 
has  the  steepest  slope. 

VII.  CONCLUSIONS 

The  test  statistic  performed  satisfactorily  over  the  range  of  shape 
parameters  and  the  percentages  of  censoring  Investigated.  For  the  three 
sample  sizes  and  the  parameter  values  studied,  test  discrimination  Is  not 
degraded  when  censoring  does  not  exceed  twenty  percent  of  the  sample  size. 
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Therefore,  under  moderate  degrees  of  censoring,  the  Q>stat1st1c 
provides  a useful  test  statistic  for  testing  the  equality  of  two  fitted 
Welbull  distributions.  The  relationships  shown  In  FIGURE  4 between  ft 
and  the  e differences  that  are  distinguishable  can  serve  as  Indicators 
of  test  discrimination.  These  Indicators  should  be  of  value  when  design* 
ing  target  detection  experimentation  and  when  analyzing  target  detection 
data  In  which  all  exposed  targets  are  not  detected. 
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TABLE  A-l,  N • 500  and  Shape  Parameter  Equal  1.0  - 1.5 


Percent 


Censored 

a 

A 

a 

1 

A 

A 

m 

0 

25 

24.750 

1.0 

1.053 

24.251 

530.968 

10 

25 

27.050 

1.0 

1.028 

26.747 

677.018 

20 

25 

30.912 

1.0 

1.023 

30.625 

896.403 

0 

25 

25.219 

1.1 

1.062 

24.629 

538.161 

10 

25 

26.285 

1.1 

1.098 

25.379 

535.669 

20 

25 

27.943 

1.1 

1.219 

26.181 

465.978 

0 

25 

25.637 

1.2 

1.188 

24.179 

417.413 

10 

25 

26.702 

1.2 

1.227 

24.979 

418.800 

20 

25 

27.661 

1.2 

1.123 

26.515 

559.533 

0 

25 

25.975 

1.3 

1.279 

24.071 

359.644 

10 

25 

25.717 

1.3 

1.312 

23.708 

332.413 

20 

25 

27.895 

1.3 

1.346 

25.592 

369.171 

0 

25 

24.131 

1.4 

1.461 

21.857 

231.181 

10 

25 

25.234 

1.4 

1.518 

22.747 

233.317 

20 

25 

25.710 

1.4 

1.388 

23.466 

293.222 

0 

25 

25.669 

1.5 

1.502 

23.169 

246.870 

10 

25 

25.341 

1.5 

1.427 

23.029 

268.047 

20 

25 

28.123 

1.5 

1.473 

25.445 

308.610 

TABLE  A-2,  N «•  500  and  Shane  Parameter  Equal  1.5  - 2.0 


Sample 

Number 

Percent 

Censored 

a 

ft 

a 

£ 

A 

£ 

E(x) 

1 

0 

25 

24.023 

1.5 

1.599 

21.540 

2 

10 

25 

26.163 

1.5 

1.546 

23.537 

3 

20 

25 

27.159 

1.5 

1.560 

24.410 

4 

0 

25 

26.947 

1.6 

1.585 

23.284 

5 

10 

25 

26.201 

1.6 

1.763 

23.326 

6 

20 

25 

24.382 

1.6 

1.590 

21.874 

7 

0 

25 

24.338 

1.7 

1.700 

21.716 

8 

10 

25 

24.927 

1.7 

1.775 

22.183 

9 

20 

25 

25.551 

1.7 

1.836 

22.701 

10 

0 

25 

24.856 

1.8 

1.839 

22.083 

11 

10 

25 

26.119 

1.8 

1.795 

23.231 

12 

20 

25 

28.095 

1.8 

1.934 

24.917 

13 

0 

25 

24.585 

1.9 

1.899 

21.816 

14 

10 

25 

25.649 

1.9 

1.769 

22.830 

15 

20 

25 

26.921 

1.9 

1.842 

23.916 

16 

0 

25 

24.507 

2.0 

1.945 

21.732 

17 

10 

25 

25.258 

2.0 

1.954 

22.396 

18 

20 

25 

26.617 

2.0 

2.019 

23.585 

V(x) 

190.289 

241.584 

255.493 

225.728 

186.667 

198.292 

172.883 

166.842 

164.275 

154.996 

179.268 

180.238 

142.759 

177.822 

181.228 

135.675 

142.935 

149.435 
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TABLE  A-3,  N ■ 500  and  Shape  Parameter  Equal  2.0  - 2.5 


Sampl e 
Number 

Percent 

Censored 

a 

A 

a 

1 

A 

e 

E(x) 

V(x) 

1 

0 

25 

25.833 

2.0 

2.050 

22.885 

136.899 

2 

10 

25 

25.150 

2.0 

2.039 

22.282 

131.058 

3 

20 

25 

25.268 

2.0 

1.995 

23.280 

148.812 

4 

0 

25 

25.608 

2.1 

2.069 

22.683 

132.273 

5 

10 

25 

25.918 

2.1 

2.003 

22.968 

143.820 

6 

20 

25 

25.970 

2.1 

2.096 

23.002 

132.933 

7 

0 

25 

25.965 

2.2 

2.372 

23.012 

106.574 

8 

10 

25 

25.956 

2.2 

2.253 

22.990 

116.586 

9 

20 

25 

25.952 

2.2 

2.317 

22.993 

110.950 

10 

0 

25 

25.530 

2.3 

2.281 

22.615 

110.349 

11 

10 

25 

25.119 

2.3 

2.270 

22.251 

107.766 

12 

20 

25 

26.443 

2.3 

2.387 

23.439 

109.267 

13 

0 

25 

24.427 

2.4 

2.329 

21.643 

97.346 

14 

10 

25 

25.088 

2.4 

2.399 

22.240 

97.532 

15 

20 

25 

26.236 

2.4 

2.577 

23.298 

94.108 

16 

0 

25 

24.550 

2.5 

2.614 

21.809 

80.384 

17 

10 

25 

24.814 

2.5 

2.478 

22.012 

90.150 

18 

20 

25 

26.159 

2.5 

2.585 

23.231 

93.100 

Tabif  a-4,  N * 500  and  Shape  Parameter  Equal  2.5  - 3.0 


Sample 

Number 

Percent 

Censored 

a 

A 

£ 

1 

A 

JL 

E(x) 

V(x) 

1 

0 

25 

24.657 

2.5 

2.573 

21.894 

83.374 

2 

10 

25 

25.261 

2.5 

2.688 

22.461 

81.143 

3 

20 

25 

26.613 

2.5 

2.690 

23.664 

89.922 

TARLP  A-5,  N » 500  and  Shape  Parameter  Equal  3.0  * 3.5 


Sample 

Percent 

A 

£1x1 

Number 

Censored 

a 

A 

a 

i 

6 

1 

0 

25 

24.828 

3.0 

3.100 

22.204 

2 

10 

25 

25.701 

3.0 

3.144 

23.000 

3 

20 

25 

26.197 

3.0 

3.067 

23.417 

4 

0 

25 

23.978 

3.1 

3.007 

21.414 

5 

10 

25 

24.927 

3.1 

3.175 

22.317 

6 

20 

25 

25.758 

3.1 

3.136 

23.048 

7 

0 

25 

25.062 

3.2 

3.288 

22.477 

8 

10 

25 

25.815 

3.2 

3.176 

23.113 

9 

20 

25 

24.788 

3.2 

3.197 

22.200 

10 

0 

25 

25.489 

3.3 

3.251 

22.847 

11 

10 

25 

25.632 

3.3 

3.181 

22.950 

12 

20 

25 

25.384 

3.3 

3.410 

22.808 

13 

0 

25 

25.141 

3.4 

3.369 

22.5/5 

14 

10 

25 

25.163 

3.4 

3.673 

22.699 

15 

20 

25 

25.401 

3.4 

3.457 

22.840 

16 

0 

25 

24.879 

3.5 

3.437 

22.363 

17 

10 

25 

24.816 

3.5 

3.525 

22.337 

18 

20 

25 

25.355 

3.5 

3.674 

22.873 
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VM 

61.431 

64.230 

69.606 

60.328 

59.436 

64.801 

56.625 

63.718 

58.088 

59.696 

62.642 

54.600 

54.652 

47.311 

53.421 

51.742 

49.316 

48.005 
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TABLE  8-1,  N ■ 250  and  Shape  Parameter  Equal  1.0  - 1.5 


Percent 

Censored 

a 

A 

0 

£ 

A 

£ 

£i*l 

V(x) 

0 

25 

23.762 

1.0 

1.020 

23.573 

534.632 

10 

25 

26.378 

1.0 

1.127 

25.258 

504.394 

20 

25 

29.632 

1.0 

1.086 

28.718 

700.942 

0 

25 

25.476 

1.1 

1.134 

24.347 

462.878 

10 

25 

29.179 

1.1 

1.170 

27.631 

560.984 

20 

25 

26.487 

1.1 

1.158 

25.158 

474.698 

0 

25 

24.300 

1.2 

1.286 

22.494 

310.885 

10 

25 

27.358 

1.2 

1.339 

25.124 

359.439 

20 

25 

28.148 

1.2 

1.256 

26.186 

440.023 

0 

25 

25.737 

1.3 

1.393 

23.476 

291.445 

10 

25 

26.655 

1.3 

1.390 

24.322 

314.016 

20 

25 

23.056 

1.3 

1.258 

21.443 

294.347 

0 

25 

27.990 

1.4 

1.413 

25.474 

334.136 

10 

25 

24.890 

1.4 

1.392 

22.706 

272.956 

20 

25 

27.801 

1.4 

1.384 

25.384 

344.597 

0 

25 

22.362 

1.5 

1.482 

20.218 

192.794 

10 

4A 

25 

me 

26.176 

A*>  AAA 

1.5 

1.490 

23.651 

261.168 

TABLE  8-2,  N ■ 250  and  Shape  Parameter  Equal  1.5  - 2.0 


ample 

Percent 

umber 

Censored 

A 

1 

1 

E(x) 

V(x) 

1 

0 

25 

24.720 

1.5 

1.693 

22.062 

179.775 

2 

10 

25 

26.497 

1.5 

1.527 

23.870 

254.103 

3 

20 

25 

26.613 

1.5 

1.514 

23.998 

260.989 

4 

0 

25 

26.040 

1.6 

1.774 

23.174 

182.153 

5 

10 

25 

26.950 

1.6 

1.776 

23.982 

194.670 

6 

20 

25 

23.984 

1.6 

1.603 

21.500 

188.652 

7 

0 

25 

23.940 

1.7 

1.634 

21.424 

180.782 

8 

10 

25 

26.037 

1.7 

1.752 

23.187 

186.544 

9 

20 

25 

27.766 

1.7 

1.658 

24.819 

236.237 

10 

0 

25 

25.325 

1.8 

1.818 

22.511 

164.492 

11 

10 

25 

25.188 

1.8 

1.742 

22.440 

176.679 

12 

20 

25 

26.896 

1.8 

1.789 

23.926 

191.213 

13 

0 

25 

26.207 

1.9 

2.020 

23.222 

144.716 

14 

10 

25 

26.284 

1.9 

1.881 

23.331 

166.163 

15 

20 

25 

26.581 

1.9 

1.836 

23.617 

177.900 

16 

0 

25 

26.000 

2.0 

2.015 

23.039 

143.155 

17 

10 

25 

25.068 

2.0 

1.876 

22.254 

151.880 

18 

20 

25 

26.476 

2.0 

2.021 

23.460 

147.584 

TABLE  B-3,  N • 250  and  Shape  Parameter  Equal  2.0  - 2.5 


8 


Sample  Percent 


Humber 

Censored 

o 

A 

o 

£ 

i 

£& 

&}. 

1 

0 

25 

24.476 

2.0 

2.104 

21.678 

117.223 

2 

10 

25 

26.505 

2.0 

2.147 

23.473 

132.560 

3 

20 

25 

27.163 

2.0 

2.020 

24.066 

155.524 

4 

0 

25 

25.263 

2.1 

2.184 

22.373 

116.729 

5 

10 

25 

26.646 

2.1 

2.071 

23.602 

142.957 

6 

20 

25 

26.016 

2.1 

2.276 

23.045 

115.090 

7 

0 

25 

22.595 

2.2 

2.079 

20.014 

102.081 

8 

10 

25 

25.165 

2.2 

2.082 

22.290 

126.272 

9 

20 

25 

27.227 

2.2 

2.252 

24.116 

128.451 

10 

0 

25 

25.664 

2.3 

2.216 

22.730 

117.367 

11 

10 

25 

25.207 

2.3 

2.556 

22.378 

88.159 

12 

20 

25 

26.006 

2.3 

2.419 

23.058 

103.245 

13 

0 

25 

24.950 

2.4 

2.292 

22.103 

104.538 

14 

10 

25 

26.426 

2.4 

2.447 

23.435 

104.468 

15 

20 

25 

25.957 

2.4 

2.374 

23.006 

106.331 

16 

0 

25 

25.662 

2.5 

2.490 

22.767 

95.563 

17 

10 

25 

26.282 

2.5 

2.416 

23.302 

105.697 

18 

20 

25 

26.208 

2.5 

2.417 

23.236 

105.035 

TABLE  B-4,  N * 250  and  Shape  Parameter  Equal  2.5  - 3.0 


Sample 

Number 

1 

2 

3 

Percent 

Censored 

o 

ft 

£ 

i 

£L*1 

V(x) 

0 

10 

20 

25 

25 

25 

24.262 

25.532 

26.152 

2.5 

2.5 

2.5 

2.535 

2.565 

2.437 

21.534 

22.668 

23.190 

82.838 

89.858 

103.054 

4 

5 

6 

0 

10 

20 

25 

25 

25 

25.741 

24.870 

26.853 

2.6 

2.6 

2.6 

2.465 

2.569 

2.774 

22.831 

22.082 

23.903 

97.911 

85.021 

86.829 

7 

8 
9 

0 

10 

20 

25 

25 

25 

25.724 

25.816 

26.075 

2.7 

2.7 

2.7 

2.653 

2.736 

2.904 

22.862 

22.968 

23.252 

86.062 

82.166 

75.714 

10 

11 

12 

0 

10 

20 

25 

25 

25 

25.143 

26.269 

26.334 

2.8 

2.8 

2.8 

2.687 

2.748 

2.914 

22.356 

23.375 

23.486 

80.425 

84.475 

76.778 

13 

14 

15 

0 

10 

20 

25 

25 

25 

24.546 

26.268 

27.227 

2.9 

2.9 

2.9 

2.704 

3.026 

2.913 

21.830 

23.466 

24.283 

75.828 

71.612 

82.091 

16 

17 

18 

0 

10 

20 

25 

25 

25 

25.083 

25.996 

26.109 

3.0 

3.0 

3.0 

2.901 

3.100 

3.176 

22.367 

23.248 

23.376 

70.178 

67.300 

65.154 
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8 APPENDIX  C 

| SAMPLE  SIZE  OF  100 
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4 
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F 

4 
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TABLE  C-l,  N ■ 100  and  Shape  Parameter  Equal  1.0  - 1.5 


Percent 


Censored 

a 

A 

a 

A 

i 

KxJ. 

0 

25 

23.178 

1.0 

1.099 

22.373 

10 

25 

28.484 

1.0 

0.997 

28.516 

20 

25 

30.436 

1.0 

1.082 

29.537 

0 

25 

23.569 

1.1 

1.030 

23.286 

10 

25 

27.753 

1.1 

1.171 

26.276 

20 

25 

26.699 

1.1 

1.137 

25.498 

0 

25 

29.079 

1.2 

1.164 

27.580 

10 

25 

24.477 

1.2 

1 348 

22.451 

20 

25 

23.936 

1.2 

1.265 

22.235 

0 

25 

21.880 

1.3 

1.306 

20.189 

10 

25 

25.352 

1.3 

1.295 

23.432 

20 

25 

30.068 

1.3 

1.391 

27.433 

0 

25 

25.828 

1.4 

1.352 

23.679 

10 

25 

25.178 

1.4 

1.468 

22.791 

20 

25 

25.076 

1.4 

1.356 

22.978 

0 

25 

24.874 

1.5 

1.642 

22.251 

10 

25 

25.526 

1.5 

1.381 

23.315 

20 

25 

27.881 

1.5 

1.481 

25.209 

V(x) 

415.600 

817.443 

747.180 

511.188 

506.755 

505.449 

565.090 

283.365 

313.330 

243.187 

332.685 

399.079 

313.640 

249.206 

293.746 

193.331 
291.978 
299.958 


TABLE  C-2,  N » 100  and  Shape  Parameter  Equal  1.5  - 2.5 


Sample 


Percent 


mr 


m 


Number 

Censored 

a 

& 

£ 

i 

m 

V(x) 

1 

0 

25 

25.677 

1.5 

1.970 

22.763 

145.523 

2 

10 

25 

25.802 

1.5 

1.470 

23.353 

261.083 

3 

20 

25 

28.284 

1.5 

1.739 

25.199 

223.403 

4 

0 

25 

24.982 

1.7 

1.761 

22.242 

170.148 

5 

10 

25 

26.609 

1.7 

2.069 

23.570 

142.841 

6 

20 

25 

30.676 

1.7 

1.856 

27.243 

231.938 

7 

0 

25 

25.189 

1.9 

1.884 

22.358 

152.180 

8 

10 

25 

26.307 

1.9 

2.115 

23.299 

134.137 

9 

20 

25 

22.409 

1.9 

1.763 

19.949 

136.580 

10 

0 

25 

23.490 

2.1 

1.890 

20.848 

131.500 

11 

10 

25 

26.654 

2.1 

2.090 

23.608 

140.722 

12 

20 

25 

23.775 

2.1 

2.130 

21.056 

108.133 

13 

0 

25 

25.297 

2.3 

2.182 

22.403 

117.322 

14 

10 

25 

24.436 

2.3 

2.307 

21.649 

99.114 

15 

20 

25 

24.798 

2.3 

2.561 

22.017 

85.016 

16 

0 

25 

24.991 

2.5 

2.298 

22.140 

104.409 

17 

10 

25 

25.673 

2.5 

2.596 

22.802 

88.972 

18 

20 

25 

27.186 

2.5 

2.636 

24.157 

97.178 
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TABLE  C-3,  N ■ 100  and  Shape  Parameter  Equal  2.5  - 3.5 


P 


< 


Sample  Percent 


Number 

Censored 

a 

A 

a 

1 

0 

25 

25.572 

2 

10 

25 

27.798 

3 

20 

25 

25.351 

4 

0 

25 

25.412 

5 

10 

25 

23.864 

6 

20 

25 

25.472 

7 

0 

25 

22.876 

8 

10 

25 

24.891 

9 

20 

25 

25.791 

10 

0 

25 

24.755 

11 

10 

25 

25.776 

12 

20 

25 

26.556 

13 

0 

25 

27.079 

14 

10 

25 

26.633 

15 

20 

25 

25.906 

16 

0 

25 

25.802 

17 

10 

25 

25.686 

18 

20 

25 

25.569 

1 

1 

El&L 

ikl 

2.5 

2.806 

22.773 

77.236 

2.5 

2.673 

24.712 

99.212 

2.5 

2.774 

22.566 

77.412 

2.7 

3.021 

22.699 

67.207 

2.7 

2.823 

21.256 

66.566 

2.7 

2.836 

22.693 

75.244 

2.9 

2.716 

20.347 

65.350 

2.9 

3.496 

22.394 

50.329 

2.9 

2.498 

22.883 

96.030 

3.1 

2.938 

22.086 

66.896 

3.1 

3.119 

23.059 

65.494 

3.1 

3.371 

23.847 

60.940 

3.3 

3.637 

24.414 

55.696 

3.3 

3.356 

23.910 

61.756 

3.3 

3.346 

23.254 

58.731 

3.5 

3.421 

23.188 

56.100 

3.5 

3.830 

23.225 

45.922 

3.5 

3.434 

22.983 

54.747 

I 


i 

\ 
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FIGURE  C-2,  N = 100  and  Beta  Between  1.5  and  2.5 


ON  THE  ROBUSTNESS  OF  THE  EXPONENTIAL  DISTRIBUTION 

George  C.  Canavos 
Virginia  Commonwealth  University 
Richmond.  Virginia 


ABSTRACT.  This  paper  examines  the  robustness  of  the  expon- 
entiai  time-to-failure  distribution  when  this  probability  law  is 
compared  against  some  logical  alternatives  such  as  the  Veibull 
and  gamma  distributions  relative  to  estimation  procedures  involving 
the  scale  parameter. 


1.  INTRODUCTION . Since  the  pioneering  work  on  life  test- 
ing and  reliability  estimation  during  the  early  1950 's  - see,  for 
example,  [1]  and  [2]  - the  exponential  distribution  has  been  the 
most  widely  assumed  probability  law  in  describing  times  to  failure 
of  many  types  of  components  and  systems.  There  is  little  doubt 
that  this  distribution  has  played  a key  role  in  both  theory  and 
application  over  the  past  twenty  or  so  years.  Surely,  therefore, 
it  is  of  continued  interest  to  query,  "What  if  the  assumption 
of  the  exponential  probability  law  does  not  hold?  To  what  extent 
then  will  such  an  occurrence  affect  subsequent  Inferences  and 
estimation  procedures  derived  as  a result  of  and  depending  on 
this  assumption?" 

A substantive  study  on  the  robustness  of  the  exponential 
distribution  is  hereby  attempted.  Where  possible,  the  treatment 
is  analytic.  Particular  attention  is  given  to  the  estimation  of 
the  scale  parameter  and  the  ramifications  regarding  the  mean- 
squared  error  (1ISE)  of  its  estimate  if  the  exponential  assumption 
does  not  hold.  The  effect  on  the  MSE  is  determined  as  a function 
of  a situation  in  which  the  true  sampling  distribution  of  life- 
times is  not  the  assumed  exponential  but  rather  is  either  a 
Welbull  or  a gamma.  By  following  such  a procedure,  the  degree  of 
robustness  of  the  exponential  distribution  is  measured  and  quan- 
tified. 


2.  THEORETICAL  DEVELOPMENT  OF  ROBUSTNESS.  Let  x1(x2 xn 

denote  the  times-to-fallure  of  n like  items.  Assume  that  these 
lifetimes  follow  the  exponential  distribution  with  probability 
density  function  (pdf) 

f (x; 6 ) - ^ exp(-9x)  , x > 0 (1) 

where  Interest  is  on  the  estimation  of  the  parameter  9.  By 
appealing  to  the  likelihood  function 
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PrtcedMt  pap  blank 


- Z xi 

Kxi.xt xn;0)  - ^ exp  ( — ^ ) 


one  can  easily  determine  the  minimum  variance  unbiased  estimate 
(MVUE)  of  6 to  be 


A 


e 


n 

I 

i-1 


(2) 


Suppose,  however,  that  in  reality  the  lifetimes  Xi,xa,...,xn  are 
realizations  of  a Weibull  random  variable  with  pdf 

h(x; 0 ,a)  « | x0-1  exp  xa)  , x > 0 (3) 


where  a is  a shape  parameter.  Again  it  is  a rather  straightfor 
ward  procedure  to  determine  that  the  MVUE  of  9 in  this  case  is 


l 


i-1 


(4) 


Thus,  if  in  reality  the  lifetimes  follow  the  Weibull,  the  optimal 
efficiency  (in  the  classical  sense)  for  estimating  6 is  provided 
by  the  MSE  of  the  MVUE  estimator  (4)  which  reduces  to  ‘ 

MSE(0)w  - {£.  (8) 


Since  the  exponential  distribution  was  assumed  to  accurately 

represent  the  lifetimes  x«,  however,  the  estimate  of  9 

is  determined  by  (2).  Thus,  what  effect  would  the  fact  that  the 
lifetimes  follow  the  Weibull  as  opposed  to  the  exponential  have 
on  the  MSE  of  the  estimator  given  by  (2)?  That  is,  if  in  reality 

xj.xa xn  follow  the  Weibull  with  pdf  given  by  (3),  then  for 

equation  (2) 


MSE( 0 ) - var(9 ) + E{0  - E(0)}J 


where 


E(§)  - E(  l -£i) 
i-1  n 

- 0l/a  p(l  + Ij 

and 
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Y 

•r 


var<§)  ■ var  ( l •£*■) 
i-l  n 

' k i!i  v,r  <Xl> 

- e2/0  {r<i  + f)  - r4<i  £)} 

Hence  after  none  algebraic  manipulation,  the  MSB  with  respect  to 
the  indicated  perturbation  is  expressed  by 

MSE(§)_ - 

1 1+I  (6) 

e2/a{rci  + h * (n  - i)ra<i  + £)}  - 2n0  °r(i  + h + nea 


where  the  notation  "E|W"  indicates  assumed  exponential  but  in 
reality  Veibull  sampling.  A comparison  between  equations  (5)  and 
(6)  provides  a measure  of  robustness  relative  to  MSE  in  the  assump- 
tion of  the  exponential  distribution  when  estimating  the  scale 
parameter  9.  Numerical  results  are  given  in  the  next  section. 

Analogous  to  the  previous  discussion,  consider  now  the  gamma 
distribution.  As  before,  assume  the  lifetimes  xi,x2,...,xn  follow 
the  exponential  with  pdf  given  by  (1).  What  are  the  consequences 
relative  to  the  MSE  of  (2)  if  in  fact  the  more  appropriate  proba- 
bility law  is  the  gamma  with  pdf 


g(x;8 ,a) 


1 

r(o)9a 


x01’1  exp  (-■£>  , x > 0 . 


First,  with  respect  to  (7),  it  is  easy  to  show  that  the  MVUE  of 
9 is 

" x 4 


while 


MSE( 8) 


fi  an 


Then  to  determine  the  MSE  of  (2),  consider 


E(§)  - E(  l -Jt) 
i-l  n 


247 


while 


n 


var(6)  - var  ( l -~) 
i-1  n 


P j,  (at) 


a62 
n ‘ 

Thus,  the  perturbed  MSE  of  (2)  reduces  to 


MSE( 6 ) 


B | G 


82{a  + o(l  - a)2} 


n 


(10) 


As  before,  the  comparison  between  equations  (9)  and  (10)  should 
reveal  the  degree  of  robustness  of  the  exponential  distribution 
as  measured  by  the  MSE  of  the  scale  parameter  8 . 


3.  NUMERICAL  RESULTS.  To  evaluate  the  robustness  of  the 
exponential  with!  regard  to  the  estimation  of  the  scale  parameter 
when  the  true  sampling  distribution  is  the  Weibull,  the  ratio  of 
equation  (8)  to  equation  (5)  is  formed.  The  notion  here  is  that 
since  in  reality  the  lifetimes  follow  the  Weibull  time-to-failure 
probability  law,  then  the  best  efficiency  of  the  MVUE  of  0 is 
provided  by  (5).  Thus  the  "perturbed"  MSE  given  by  (6)  should  be 
compared  to  (5).  Table  1 contains  this  ratio  computed  for  several 
values  of  6,  a and  the  sample  size  n. 

By  a similar  argument,  the  ratio  of  equation  (10)  to  equation 
(9)  is  formed  to  quantify  the  robustness  of  the  exponential 
relative  to  the  gamma  distribution.  However  in  this  case,  the 
ratio  is  the  simple  expression  given  by 


02{ot  + n(l  - a)2}/n 
" 02/na  ' 

» a{a  + n( 1 - a) 2 } 


which  is  seen  to  be  independent  of  the  value  of  0.  For  various 
values  of  a and  n,  this  ratio  is  given  in  Table  2. 


4.  CONCLUDING  REMARKS.  Based  on  the  results  contained 
herein,  it  is  apparent  that  relative  to  the  estimation  of  the 
scale  parameter,  the  exponential  distribution  is  extremely  sensi 
tive  if  in  reality  the  Weibull  is  the  sampling  distribution  and 
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Table  1 


Ratio  of  MSE(6)E|W  to  MSE(§)W 


l 's\  • 

•X 

0.8 

0.9 

1.10 

1.20 

— 

1.50 

2.00 

2.50 

! 

1 5 

0.97 

2.29 

0.71 

0.75 

1.24 

1.86 

2.21 

i 10 

11.60 

2.93 

0.74 

0.93 

1.77 

2.61 

3.03 

15 

15.46 

3.39 

0.77 

1.05 

2.07 

2.99 

I 20 

18.87 

3.76 

0.80 

1.15 

2.27 

3.23 

3.64 

! 25 

21.96 

4.08 

0.82 

1.22 

2.43 

3.39 

3.80 

30 

24.81 

4.35 

0.84 

1.29 

2.56 

3.52 

3.92 

! 35 

27.48 

4.60 

0.86 

1.34 

2.66 

3.62 

30.00 

4.83 

0.87 

1.39 

2.74 

3.70 

4.08 

i 45 

32.39 

5.03 

0.89 

1.43 

2.81 

3.77 

4.14 

i 50 

1 

34.68 

5.22 

0 . 90 



2.88 

3.83 

4.19 

i 

! n - 10 

t 

! 5 

9.38 

2.62 

0.85 

1.15 

2.36 

3.69 

4.40 

10 

16.75 

3.58 

0.93 

1.58 

3.46 

5.20 

6.05 

i 16 

23.02 

4.28 

1.07 

1.86 

4.08 

5.96 

6.82 

20 

28.61 

4.86 

1.15 

2.07 

4.51 

6.44 

7.28 

25 

33.71 

5.35 

1.21 

2.24 

4.82 

6.78 

7.00 

30 

38.45 

5.79 

1.27 

2.37 

5.07 

7.03 

7.83 

35 

42.89 

6.18 

1.31 

2.49 

5.28 

7.23 

8.01 

40 

47.10 

6.54 

1.36 

2.60 

5.45 

7.40 

8.16 

45 

51.11 

6.87 

1.39 

2.69 

5.00 

7.54 

8.28 

50 

54.94 

7.18 

1.43 

2.77 

5.73 

7.05 

8.38 

n ■ 20 


5 

14.20 

3.29 

1.13 

1.94 

8.79 

10 

27.05 

4.86 

1.45 

2.87 

1 

12.09 

15 

38.14 

6.06 

1.68 

3.47 

KJS 

13.63 

20 

48.10 

7.04 

1.85 

3.91 

8.96 

8 I'M:'  w; 

14.55 

25 

57.23 

7.90 

2.00 

4.26 

9.60 

13.55 

15.19 

30 

65.73 

8.65 

2.12 

4.55 

10.10 

14.06 

15.66 

35 

73.72 

9.34 

2.22 

4.80 

10.52 

14.46 

16.02 

40 

81.30 

9.96 

2.32 

5.01 

10.87 

14.79 

16.31 

45 

88.53 

10.54 

2.40 

5.20 

11.16 

15.07 

16.55 

50 

95.45 

11.09 

2.48 

. j 

5.37 

11.43 

15.31 

16.75 
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the  shape  parameter  is  less  than  one.  However,  there  is  a modest 
range  of  the  shape  parameter  - say  (1.0, 1.3)  - for  which  there 
is  substantial  robustness  on  the  part  of  the  exponential  distri- 
bution. Moreover,  the  robustness  is  more  apparent  for  smaller 
smaple  sizes  and  smaller  values  of  8 . 

For  the  case  involving  the  gamma  distribution,  to  some 
extent  the  opposite  appears  to  hold.  That  is,  for  values  of  the 
shape  parameter  that  are  less  than  unity,  considerable  robustness 
is  apparent  especially  for  small  sample  sizes  with  only  a modest 
amount  present  in  the  neighborhood  but  on  the  positive  side  of 
one. 


Table  2 


Ratio  of  MSEC  § )E | G to  MSE(§)Q 


a 

5 

10 

20 

0.50 

0.875 

1.50 

2.75 

0.60 

0.840 

1.32 

2.28 

0.70 

0.805 

1.12 

1.75 

0.80 

0.800 

0.96 

1.28 

0.90 

0.855 

0.90 

0.99 

0.95 

0.914 

0.93 

0.95 

1.00 

1.000 

1.00 

1.00 

1.10 

1.265 

1.32 

1.43 

1.20 

1.680 

1.92 

2.40 

1.40 

3.080 

4.20 

6.44 

1.60 

5.440 

8.32 

14.08 
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RANDOM  INTERVAL  RELIABILITY 


Gerald  R.  Andersen 
Headquarters,  U.S.  Army  Materiel 
Development  and  Readiness  Cmd. 
5001  Eisenhower  Ave. 
Alexandria,  Virginia 


Abstract.  Simple  expressions  are  derived  for  interval 
reliability  when,  in  addition  to  random  life  and  repair  times, 
the  time  of  request  for  system  availability  and  the  duration 
of  the  mission  occasioned  by  that  request  are  random  variables, 
rather  than  numerical  constants.  The  results  constitute  a 
simple  generalization  of  the  interval  reliability  results  noted 
in  Barlow  and  Proschan  [11. 

The  investigation  was  motivated  by  the  desire  to  discourage 
the  extensive  misapplication  of  the  result  of  [ 1 ] p.  82  in 
setting  reliability  values  for  large  scale  Army  systems  in  pre- 
development requirements  documents. 

1.  Introduction.  Let  r be  a stochastic  process  whose  value, 
r(t),  at  a particular  time  t£0,  describes  the  operating  state  of 
some  system  at  time,  t.  We  will  only  consider  systems  with  two 
states,  u£  (operable/operating)  or  down  (in  repair).  Specifically, 
we  will  say  that  the  system  is  up  at  time  t if  F(t)"l  and  down 
at  time  t if  r(t)«=0.  We  assume  that  r(0)*sl  with  probability  one. 

Starting  at  time  t-0,  let  ,Xj ,Yj * • • • denote  the  successive 

lengths  of  time  that  the  process,  T,  spends  in  the  up  or  down 
state,  respectively. 

Let 

Tv  * VYv  ' (1-l> 

s -0  and  define  S_  by  setting 
o n 


Throughout  most  of  this  note  each  of  the  sequences  {X^},  and 
{Y^}  will  consist  of  independent  and  identically  distributed  (IID) 
r.v.'s.  In  this  case,  { Sn } is  the  usual  type  of  renewal  process 
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fr? t.yVi.  ..  • - 


used  to  study  ays tans  where  the  X^'s  are  the  tinea  to  failure  and 
the  Yj ' a are  the  tinea  to  replacement  or  to  repair  to-original- 
condition. 

Associated  with  this  renewal  process,  (S  },  is  the  oounting 
process  N(t),  where 

N(t)  - k and  * Sk  (1.3) 

if,  and  only  if, 

3k  < t < Sk4l  <M> 

The  "residual  life4' process,  £(t),  defined  by  setting 


C(t)  - SM{t)  + XN(t)+l  “ t *1,5> 

(t>0)  is  useful  in  investigating  the  probability  that  T(t)*l 
during  various  intervals  of  tine. 

Since  N(t)  represents  the  number  of  times  the  process  r(t) 
returns  to  the  up  state  during  the  interval  (0,t) , the  event  that 
C(t)>X  coincides  with  the  event  that  the  system  is  in  the  up 
state  at  time  t and  remains  in  that  state  for  at  least  X units  of 
time 


+■ 

0 


SN(t) 


* 


■> 

-♦ 


SN(t)+XN(t)tl 


< > 

SN(t)+l 


In  section  2 we  will  obtain  exact  and  asymptotic  expressions 
for  the  probability  that  C(t)  exceeds  the  quantity  M when  both 
t and  M are  random  variables.  This  probability,  that  the  system 
is  up  throughout  the  interval  [t,  t+M] , is  called  interval 
reliability  by  Barlow  and  Proschan  [1  j p.  82,  in  the  case  where 
t and  M are  non-random.  It  is  interesting  to  note  that  many  Army 
documents,  including  a guide  on  reliability  techniques  1 10], 
apply  the  result  in  ( 1 ] but  with  the  claims  that  either  t or  M 
are  random. 


The  mathematics  required  to  make  this  extension  from  the 
well-known  results  in  Barlow  and  Proschan,  or  Gnedenko  [ ] , or 
Feller  [||]  is  very  simple,  but  in  some  ways  the  results  are 
reasonably  interesting.  In  spite  of  this,  it  is  doubtful  that 
one  would  announce  the  results  of  such  a simple  task  if  it  were 
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not  for  the  insane  realism  that  some  practitioners  of  reliability 
inject  into  reliability  "requirements*1  as  deduced  from 
mathematical  facts  about  residual  life.  This  topic  is 
expanded  on  in  Example  A of  section  2. 

In  section  3,  we  note  the  well-known  fact  that  an 
ayamptotio  result  of  section  2 is  the  limit  of  a statistic 
which  gives  the  percentage  of  time,  during  n renewals,  that  the 
system  is  up  and  remains  up  for  a sufficient  amount  of  time  to 
support  a mission  of  duration  J6  . A result  is  then  stated 
concerning  the  asymptotic  normality  of  a similar  statistic 
(one  representing  the  percentage  of  up-time  that  the  system 
is  available  for  a mission  of  duration  3*  ) . 

Section  4 is  an  attempt  to  consider  the  interval  reliability 
problem  when  successive  system  life  and  repair  times  are  not 
identically  distributed. 


2,0  Reiddual  life;  independent  and  identically  dlstrjUauted 
gg|£.  Let  the  sequences  IX^)  and  t y j > of  ieSHon  1 be  55qy5n551i 

of  independent  and  identically  distributed  positive  random  variables 
(r.v.'s)  and  assume  also  that  (X^J  and  (Yj)  are  independent  of 

each  other.  Thus,  in  this  section,  the  X^'s  have  the  usual 

interpretation  of  time  to  system  failure  and  the  Yj's  the  time  to 

replace  or  repair  the  system  to  a state  whioh  is  as  good  as  new. 

We  will  denote  the  common  distribution  function  (d.f.)  of  the 
X^'s  by  0,  of  the  Y^'s  by  H and,  where  appropriate,  use  X to  refer 

to  one  of  the  X^'s  and  Y to  one  of  the  Yj's.  Set  F equal  to  the 

d.f.  of  T ■ X+Y . Let  the  positive  r.v. 's  t and  M of  section  1 be 
independent  of  each  other  and  of  the  sequences  {X^}  and  (Y^>. 

Denote  the  d.f.'s  of  t and  M by  X and  L,  respectively.  Although 
termed  a positive  r.v.,  M will  be  allowed  to  take  the  value  zero 
with  positive  probability)  especially,  the  case  M»0  with  probability 
one  (a.s.).  This  allows  "availability"  as  well  as  interval 
reliability  statements  to  be  included  in  the  same  expression. 

When  M«0  (a.s.),  the  L«e,  where  c will  denote  the  unit  d.f.'  t 


e(y)  « J D»  if  y < ° (2.1) 

l 1,  if  y > 0 

To  avoid  needless  complications,  we  suppose  that  K(0)»0 
and  G(0)»0  (the  latter  guarantees  that  passage  of  the  system  from 
one  down  state  to  the  next  is  never  instantaneous).  Zt  follows 
that  F (0) »0 . Let 

U(t)  - ? F*k(t) , (2.2) 

k-1 

where  F**  denotes  the  k-  fold  convolution  of  F with  itself.  It 
is  well-known  that  the  renewal  function  U(t)  < +°°  for  each  t 
(0  < t < +•)  and  U(t)  - EN (t)  (cf.  section  1).  Consult  Feller  [ f | ] 
for  facts  about  U,  but  note  that  his  U counts  So«0  as  the  first 

renewal  of  the  process  {Sn>  and  so  equals  1+U,  u being  given  as 

in  (2.2).  The  definition  (2.2)  follows  most  "applied*  probability 
and  reliability  texts  (e.g.  (11,  Ml],  [*)]). 

The  physical  meaning  of  these  four  sets  of  r.v.'s  is  as  stated 
in  section  1.  Mathematically,  since  we  have  assumed  that  all  r.v.  's 
are  independent,  we  can,  without  loss  of  generality,  take  them  to  be 
defined  on  the  same  probability  space,  ft  . 
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Set  Q equal  to  the  d.f.  of  the  r.v.  Z - X-Mio  that 

Q(s)  ■ /**G(z+y)dL(y) 
o- 

for  all  z in  (-»,*•) . 


(2.3) 


Results i We  shall  now  state  and  discuss  the  results  of  this 
section j If  a proof  is  cumbersome  it  is  placed  at  the  end  of  the 
section. 


Theorem  1.  If  BX#  BY  and  Bt  are  finite,  then 


PU->  M)  • 7(X(z)+/(X(z+s)-X(s))dU(s))dQ(z)  (2.4) 

• oo 

Thus,  (2.4)  gives  the  probability  that  the  system  is  up  at 
some  randomly  selected  moment  in  time,  t,  and  remains  up  for  a 
random  duration,  M,  of  the  mission  occassioned  by  the  request  at 
time  t.  By  specifying  onl£  the  d.f.  of  t in  Theorem  1,  we  have 
the  following 

Corollary  1.  If  the  request  time  t is  exponentially  distributed 


PU_>  M)  - (l-f(X))  * / (1-e  A‘)dQ(z) 

T o 

where  $ is  the  Laplaca-Stieltjes  transform  of  the  d.f.  F 


To  verify  the  corollary  from  (2.4)  just  note  that 


k(s+z)  - k(s) 


rX,(l-e'u) , 


so  that  P(?T  > M)  - £“{(l-e“Xa)  + (l-e"Xa)  7 e"XadU (s) }dQ(z) 


- ( 1+U ( X ) ) 7"(l-e“A*)dQ(z) , 

. 0 

where  U is  the  Laplace-Stieltjes  transform  of  U.  Equation  (2.5) 
follows  since  0(X)  ■ $ (X)/ (1-F (X) ) for  all  X>0  (recall  that  F(0)»o) 

Remark  1»  It  is  both  Intuitively  and  analytically  obvious  that 
(2.5)  may  be  written  in  the  form 


P(C  > M)  - P (t  < x-m|t  < X+Y). 


(2.6) 


Intuitively,  because  the  exponential  distribution  has  no  memory 
and  analytically,  because 

7**(l-e"Xa)dQ(z)  - P(t  < X-M)  - P(t  < X-M,  t < X+Y) 
o 


and 


1 - $tx)  * P(t  £ x+y) . 


of  ***•  •xPon«ntial  assumption  on  t 
by  ''otln,  ““ lf  x *•  t*k,n  « *>• » 


K<«)  - Za  (l-a“*v*) , 


(2.7) 


av  > 0.  for  all  v,  Zay-  1,  Xv  > 0 (that  ia,  tha  tail  of  K can  be 
•xpraaaad  aa  a Dirichlat  aariaa) . Than  (2.3)  praaarvaa  in  tha  form 

PUX>  M)  - Z av(l-P(Xv))-1  /"( l-a-M)dQ(a)t 
^ 0 

Remark  3.  Sat  Q equal  to  tha  d.f.  of  (X-M)+  where  q+ 
tha^IuneEion  whiSh  aquala  S if  s>0  and  0 if  8 < 0.  Than  since 

»%.„  rtpu«*a*by°6+ ) n«°  • ■’  °+  *nd  ° diff*r- 


p<«T<x)>  M>  " 


(2.8) 


SMiidrSut°frS21?2,«??*aJ*<.a?!y  °™?Putati°»  (simulation  ia  easily 

JT&XHKt'fcJUf  X“  followln?  ob*'™tio- 

Pt5rU)>  m>  * E'X-M)+/  (Wj+Uj)  (2.9) 

where  y^  ■ EX,  y2  ■ EY<+«  and  E(x-M)+  £ EX  < +». 

ouotiSJi  STni^  Sh*  RHS<of  {2'*l  “ th*  r*tio  of  <*•  difference 
quotients  of  Q+  and  F»  passing  to  the  limit  as  X+0+  gives  the 

sti°4f.S5*E;*“:.K  (x‘M,+  #nd  x+y  (''hi°h  both  •*»« 


expect'  tJ?«  limit  in  (2.9)  is  preserved  if  the 
exponantiality  of  request  time  ia  dropped  and  x(X)  is  raciarAd 
by  any  sequence  {tn>  which  converges  in  probability  to  +S. 


Theorem  2.  Let  yx-EX  and  u2"EY  be  finite  and  T non-lattice,  if 

Tn  <n-1J  ie,  e aequence  of  positive  r.v.'s  and  in  orobabilitv. 

then  n 

P(CTft>  M)  - E(X-M)+/(y1+U2)  (2.10) 


as  n ■*  +•. 

(The  proof  is  at  the  end  of  this  section.) 
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Remark  4.  A simple  calculation  showa  that 


B(X-M)  + - B(X-M|X>M)  P(X>M)  (2.11) 

Also,  if  we  let  the  minimum  of  two  real  number a a and  b ba  denoted 
by  Mb  and  obaerve  the  identity 

(e-b)*  • a-aAb, 


then  we  can  expreaa  (2.10)  in  the  following  two  equivalent  forma 

p(?  > M)  «•  E^^1X>M)  • P (X>M)  - (2.12) 

Tn  Mrw2  yl*M2 


aa  n +•. 


When  M-0  a.a.i  the  RHS  of  both  (2.10)  and  (2.12)  reduce  to 
the  so-called  "availability**  of  the  system} u^/fy^+yj) . The  last 

relation  in  (2.12)  ia  therefore  especially  intuitive  since  it 
shows  directly  the  amount  by  which  the  availability  should  be 
decreased  if  one  wants  to  account  for  the  system  being  up  through- 
out a mission  of  (random)  duration,  M. 


In  view  of  the  above,  it  would  seem  to  be  appropriate  to  call 
A(N)  - ECX-MI  + 


system  availability  for  missions  of  length  M. 


(2.13) 


Remark  S,  When  t>0,!£>0  are  (nonrandom)  real  numbers  and  x-t, 
to- is  (a.o. ) then  Q\e  classical  limit  of  P(£t>»),  as  t-»*»,  (e.g. 

[ 1 ] , I *+  ) , [ If  ] ) agree  with  all  the  above-mentioned  forms;  just 

note  that  /3(y)dy  - ^(y-w)dG(y)  - £(y-*)+dG(y) , 3 » 1 - G . 

Examples i 


A.  Let  t be  exponential  as  in  Corollary  1 and,  in  this  first 
example,  let  X also  be  exponential  with  parameter  6^  (EX  - y^  - e^1) . 
Whenever  X has  this  distribution  it  follows  from  (2.3)  that 

Q(y)  - e'01Y  £(9^, 

A 

where  L is  the  Laplaee-Stieltjes  transform  of  L.  Direct  calculation 
then  gives 

/"(l-e“Xy)dQ(y)  - x£(01)/U+9l)  . 
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f 


Sines 

$U)  - e1HCX)/<x+e1) 

we  find  (using  Corollary  1)  that 


P(CT>M) 


yi+(l-fi(X))X 


(A.l) 


where  the  distribution  of  M and  ¥ remain  to  be  specified.  To  note 
the  resemblance  to  (2.12)  just  observe  that  L(61)-P(X>M) , in  this 

example.  Of  course , if  we  let  X*0+  in  (A.l),  we  would  obtain  a 
special  case  of  (2.12). 

Now,  if  we  further  specify  the  distribution  of  ¥ to  be 
exponential  with  parameter  d2,  we  obtain  (from  (A.l)) 


P(Ct>  M) 


yl+  Spi 


£(81) 


(A. 2) 


Finally,  taking  M to  be  exponential  also,  (A. 2)  becomes 


P(5t>  M)  - 


Pi+V 


1T*M 


(A. 3) 


Wl+(e2+x) 

where  uM-EM.  So,  in  this  case,  P(£X>M)  has  the  appearance  of  the 
product  of  two  "availability"  terms. 


If,  instead  of  being  exponential,  M is  taken  to  be  degenerate 
atX,  i.e.,  L(s)«e(s-X),  where  e is  defined  in  (2.1),  it  follows 
from  (A.l)  that 

PU  > M)  - ^ rr  e~^v'l  (A. 4) 

T y1+(l-H(X) ) X x 

with  the  distribution  of  Y unspecified. 

Notice  that  if  X-+0  in  (A. 4)  (or  just  use  (2.12))  the  RHS  of 
(A.  4)  is 


Vy2 


. e-*yl 


(A.  5) 


It  is  the  almost  exclusive  use/misuse  of  this  formula  that 
causes  one  to  produce  the  variations  on  a theme  found  in  this  note. 


2b8 


( 


! 

Foracample,  one  objectionable  use  of  (A. 5)  is  to  specify  X and  f 

*nd  then  to  set  the  expression  in  (A. 5)  equal  to  some  high  number,  i 

each  as  .97,  and  solve  for  y, . (This,  of  course,  is  done  with  no  f 

knowledge  that  the  life  distribution  is  exponential.)  This  y. , call 
it  \iy  i*  then  claimed,  in  advance  development  documents,  to  be  the 
"required”  mean-time-to-failure  of  the  system;  usually  this  is  a 
complex  military  system  which  has  either  never  been  produced  before 
or  one  for  which  we  lack  a substantial  base-line  of  experience  under 
a realistic  mission  profile.  To  make  matters  worse,  this  value  of 
W®  and  a similarly  derived  value,  yf,  obtained  by  setting  (A. 5)  equal  j 
to  some  slightly  smaller  number  such  as  .94,  are  used  as  the  null 
and  alternate  hypotheses,  respectively,  in  a statistical  acceptance 
plan.  Note  that  when  this  so-called  acceptance  plan  is  applied,  it 
will  be  to  a total  population  of  perhaps  one  or  two  systems.  More- 
over, the  system  will  be  constantly  undergoing  design  changes  and 
differing  conditions  of  stress.  Needless  to  say,  such  practices  often 
produce  a reject  signal  from  the  testing  community.  If,  on  the  basis 
of  experience  and  common  sense,  the  systems  under  test  are  judged  to  ! 

do  their  job  reliably,  at  reasonable  coat  and  more  effectively  than  i 

any  system  in  the  arsenal,  these  reject  signals  are  properly  ignored, 
but  often  not  without  the  significant  costs  of  re-tests,  check-tests,  j 
needless  re-design  and  a near  infinity  of  meetings,  briefings  and  ; 

"analyses". 

The  purpose  then  of  the  present  note  is  to  furnish  Army  statis- 
ticians with  two  more  "degrees-of-freedom"  (mission  and  request  time  j 

distributions)  in  numerous  formulas  that  will  aid  him  in  convincing 
the  occasional  naive  practitioner  of  reliability  that  applications 
of  (A. 5)  as  described  above  are  a totally  unrealistic  way  of  setting 
reliability  requirements.  This  can  be  done  by  producing  a wide 
variety  of  answers  with  judicious  choice  of  distributions  for  mission 
and  request  times.  The  variability  obtained  through  distribution 
which  cannot  be  predicted  might  be  enough  to  convince  the  R&D  com- 
munity to  state  reliability  f igures-of-merit  as  goals-to-point-toward 
and  not  hard  requirements  to  be  "demonstrated"  in  some  psuedo-statis- 
tical  test.  The  only  possible  danger  is  that  the  results  stated  in 
this  paper  will  be  misused  in  the  same  way  as  (A. 5) . It  should  be 
emphasized  that  designing  reliable  military  systems  is  of  the  utmost 
importance  and  it  is  not  the  purpose  of  these  remarks  to  argue  other- 
wise. On  the  contrary,  it  is  hoped  that  by  discouraging  an  absurd 
approach  to  setting  reliability  requirements  emphasis  will  be  placed 
on  engineering  reliability  into  new  systems. 

Before  concluding  example  A,  consider  two  additional  distributions 
for  M.  First,  when  M is  uniformly  distributed  over  (0,T): 


P(CT>  M) 


1-e 


eiT 


6 i T 


hj 

Ui+(l+6a)  * 


(A. 6) 
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with  t,  X and  Y axponential  as  above 


Tha  second  is  when  H is  normally  distributed  as  N(y,c) 
conditioned  to  be  positive.  Then 


tt,  (e.c)*  *(J-ce1) 

P(5  > m)  rr  expt-te.Y-  — - — ) -■?  ■' 

T Ui+tfij+X)  1 1 2 ♦ (*) 


(A. 7) 


where  * is  the  d.f.  of  the  standard,  N(0,1),  normal  r.v. 


B.  Because  of  the  ease  of  calculation  we  consider  the  case 
when  JTis  Rayleigh  (a) , 


P(X>s)  - *“»2/2®2, 


and  M is  Rayleigh  (a) . Then 


EXAM  - r g"*1/2®2  . e"*2/2oldB 
o 


,-»V2S*d,  . 


where  8 ■ ao//a*+oa  and  so 

E(X-M)  + - EX-EXAM  - JZ  a (1 ) 

* 2 A*a+oa 


Since  EX>^2  a,  we  therefore  can  write  (2.13)  in  the  form 


- A(0)  (1- 


Another  simple  application  of  Theorem  2 is  obtained  when  X 
is  Gamma  (2,8) t 


P (X>s)  - e“0*  (1+Ss) 


*<">  - ere?  • 'i-  ^77  > 


/o!+o! 


and  M is  the  square  of  a N(0,cr)  r.v..  Then  (after  some  tedious 
calculations) 


A(M)  - A(0) 

(l+2oaB) i/i 


(B  .2) 
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C.  We  now  return  to  Theorem  1 and  ehow  its  relationship  to 
some  Known  results  on  availability , without  using  the  exponential 
assumption  of  Corollary  1,  for  the  request  time  distribution. 


1 «e  • $ 


l*br  this  purpose,  let  the  request  time  be  a fixed  constant, 


t(w)  * T > 0 


for  all  weft.  Then  K(x)  ■ e(x-T),  where  e is  defined  in  (2.1). 
Then  the  function 

*<s,z)  - k(s+z)  - k(s) 

is  equal  to  one  in  the  unbounded  region  defined  by  0 < s < T,  and 
a + z > T and  zero  otherwise. 


It  follows  that  the  RHS  of  (2.4)  is  given  by 

/*K(y)dQ(y)+/V“(K(a+y)-K(s))dQ(y)dU(s) 
0 0 0 


(C.l) 


/ dQ(y)  + / S(T-s)dU(s) 
T 0 


5 ■ 1 - Q. 


For  ease  of  computation,  let  X,  Y and  M be  exponential  with 
U2*®2^  and  respectively.  Then  it  is  easy  to 

show  that 

U(t)  - i(t-  j <l-e"at)) , 
where  u*Uj+U2  end  a-S^+Qj. 

Since,  <5(y)  - a exp(-0  jyJ/Oj+o)  , for  y>0,  we  obtain  ((2.4) 
and  (C.l)) 


P<V  M) 


vi*v» 


i C1+  V2e-Tl 


(C . 2) 


Notice  that  if  T-n-»,  (C.2)  becomes  (A. 3)  with  X«0  as  it 
should.  Also,  If  yM*0  In(C.2)thsn  (23)  of  [12]  Is  obtained. 


) 

i 


Proof  of  Theorem  1.  Proceeding  either  with  a standard  Renewal 
theory  argument  or  directly  from  the  equation  for  the  d.f.  of 
t nonrandom,  in  [ 1 ] p.  4*6  or  [II)  p-'iSH  we  obtain  z 

P(«t>  M)  -“frt)  + Q*U(t),  (2.14) 

where  Q is  given  in  (2.3).  Alternately,  this  is  a special  case 
of  a more  general  (non-identically  distributed)  case  derived  in 
section  4.  Since  K{0)«0  (K-d.f.  of  x) 

/*Q(t)dK(t)  ■ /*K(t)dQ(t) 

0 o 

Consider 

/8*U(t)dK(t)  - /"/*  S(t-y)dU(y)dK(t) 

0 0 0 

■ / / Q(s)dK(s+y)dU (y)  and  observe  that  this  last 

integral,  cail  it  1,  is  finite.  This  follows  from  Ex<+®,  and 
the  well-known  fact  that  U(y)*v<y/u  as  y-*-»,  since  then 

1 < /"*(1-K(y)  )dU  (y)  - /*U(y)dK(y)  - 0(Et)  < +* 

0 0 .1 
(we  have  made  use  of  the  fact  that  U (y)  (l-K(y)J^<  ~ y (l-K(y) ) -*-0 

if  Ex  < +») . Now  w 

/“ff(s)dK(s+y)  - -Q (O)K(y)  + /"l?(s+y)dQ  (s) 

O 0 

so  that 

.M5*U( t)dK(t)  » -/°V*{5(s)d)T(s+y)dU (y) 

O 0 0 

- ^(0)/S8K(y)dU(y)  - /“/°K(s+y)dQ(s)dU(y) 

0 0 0 

- /V#K(y)dU(y)dQ(s)  - /"M<(a+y)dQ(B)dU(s) 

0 0 0 0 

■ / / (K(s+y) -K(y) )dU (y)dQ(s) 
o o 

Proof  of  Theorem  2,  The  sequence  x ♦+•  in  probability  if  given 
e>0,  A>0  there  exists  an  integer  n0>nQ(e,A)  such  that 

P(x  > A)  > 1-e  if  n>n  . 
n — o 
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Lotting  be  the  d.f.  of  Tn,  n^l,  we  can  write 

*■» 


'n 


*UT  > M) 
n 


f S(t)dK_(t)  + / Q*U (t) dK_ (t) 
o n q n 


I(n)  + J (n) 


Now,  let  e>0  be  arbitrary  and  choose  A>0  aueh  that  $(A)-P(x-M>A)<c, 
thfin  f 


o < i(h>-* /A  5(t)dxn(t)  + ;"C(t)dK„(t) 
o n A n 


< Kn(A)  + C{l-Kn(A)) 

< Kn (A)  + c 


ao  that 


0 < limaup  ICn),<  c 
n+®  " 


Therefore,  aince  e X)  is  arbitrary. 


lim  X(n) 

n-H» 


0. 


Por  the  term  J(n),  wa  of  courae  follow  the  uaual  proof  and 
uae  the  Key-Renewal  Theorem.  Thia  places  an  integrability  require- 
ment on  Q which  is  equivalent  (in  our  case)  to  showing  that 

S S(t)dt  < +».  This  follows  from  EX  < +•.  The  assumption  that  T 

la  non-lattice  is  trivial  in  our  application  and  can  be  guaranteed 
by  requiring,  for  example,  either  X or  Y to  have  absolutely 
continuous  d . f . 1 a . 


The  Key-Renewal  Theorem  then  states  that 

Q*U(t)  ■+  i / ${y)dy 

’•  o 

••  t 4 +*  where  y-y^Uj.  in  what  follows,  call  this  limit  B. 


The  argument  for  J is  similar  to  the  one  applied  to  I . 

n n 


That 


is,  by  the  previous  limit,  there  is  some  C such  that  Q*U(t)  is 
within  a preselected  distance  5>0  of  B for  all  t>C«C(5).  Given 
some  e>Q,the  convergence  to  +»  of  xn  is  then  used  to  find  an 

no"no(E'c)  80  P (Tn>C)  £ 1-e  if  n>nQ.  All  this  allows  us 

to  conclude  that  both 


* 
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/ Q*U (t)dK  (t)  > (B-S)P (t  >C)  > (B-5) (1-e)  - B-e' 
c n ~ n — 


and 


/ Q*U(t)dK  (t)  < (B+6) 
c n 

Since,  also, 


i£  n>n^. 
o 


0 < / Q+U(t)dK  (t)  - 0(P(t  < C)  - «(1) 
“ j n n~ 

as  n-*-®,  we  have 


lim  J (n)  - B - ~ /*5(t)dt  . 
* o 


It  remains  to  evaluate  the  integral  of  Q:  (Recall  EX<+») 


00  00 


/ 5(y)dy  ■ f f P (X>y+m)dL (m)dy 
o oo 


■ / / P(X>y+m)dy  dL(m) 
o o 


/"/  P (X>s)ds  dL (m) 
o m 


* / P (X>s ) / dL(m)ds 
0 0 


/ P(X>s)  P (M<s)ds 

0 “ 


■ EX  -/  P(X>s,  M>s)ds 
o 


EX  -/  P(XAM>s)ds 
o 


- EX  - E(XAM) 

where  XAM  ■ minimum  of  X and  M.  Clearly,  0 <,  EX-E(XAM)  < EX  < +®. 


i 1 

A 

•i 
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3.  Additional  Comments  on  the  IIP  Case.  Using  the  stochastic 
model  of  section  2,  the  percentage  of  time,  during  n renewals  of 
the  system,  that  the  system  is  up  and  remains  up  for  a sufficient 
amount  of  time  to  support  *»  mission  of  length  % is  given  by 

ph  “ pn<*>  - J1(Xi"3fr>+/sn  ' <3-l> 

(n>l)  . (Throughout  this  section,  will  be  a strictly  positive 
real  number.)  Assuming  that  ET»EX+EY<+»,  it  follows  from  the  law 
of  large  numbers  and  Slutsky's  Theorem  (cf.  Cramer  [3]  p.  255)  that 

Pn<*)  ♦ J E(X-*)  + , (3.2) 

in  probability  as  n+<»,  y*u1+U2“EX+Ey  • 

Thus,  the  statistic  p (*)  is  a consistent  estimator  of  the 
quantity  E(X-Jt )+/u,  the  nubiquitous  limiting  interval  reliability 
of  [JL]  and  a special  case  of  Corollary  1.  The  simple,  practical 
nature  of  pn(J6)  probably  explains  the  interest  in  describing 

systems  by  means  of  interval  reliability. 

A related  statistic  with  similar  intuitive  appeal  is 
*„(*)  - " (Xi-3fi)+/  ZXL 

Clearly,  this  statistic  gives  the  percentage  of  up-time  that  the 
system  is  available  for  a mission  of  length  Jfi  and  is  a consistent 
estimator  of  the  quantity  ^ (*) *E (X-*) +/u^ . From  Corollary  1 of 

section  2,  this  quantity  is  also  easily  seen  to  be  the  limit  of 
the  probability  that  £T  > *-  given  that  > 0 as  n-*-<»,  when 

T ■*■+“»,  in  probability.  n n 

Using  the  work  of  Skorohod  [ <7  1 Chapter  1,  Sec.  6,  Pyke  l 7 ] , 
Pyke  and  Shorack  [ 8 ] , and  arguments  similar  to  those  in  recent 
work  of  Barlow  and  Proschan  ( 2-  ] , it  can  be  shown  (under  additional 
assumptions)  that  /rf  (^n(*)  - ^<*) ) converges  in  probability  to  a 

normally  distributed  r.v.  , N(0,c  (*■)),  where  the  variance  can  be 
calculated  explicitly,  in  terms  of  \p  (*)  , Var  X,  Var  (X-*-)  , 

and  the  d.f.G.  The  proof  is  outside  the  scope  of  this  note  and 
will  be  reported  elsewhere. 

The  usefulness  of  such  a result  is  that  it  places  emphasis 
on  n ()£■ ) , a directly  measurable  quantity,  rather  than  on  \p(X), 

which  requires  a distributional  assumption. 
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(4.0).  Residual  Life:  non-identlcallv  distributing  case. 

In  this  section  we  suppose  only  that  the  sequences  (X^>  and  {Y^ } 

of  positive  random  variables  are  each  sequences  of  independent 
r.v.'s  and,  further,  that  the  sequence  {X^>  is  independent  of  the 

sequence  (Y^ } . 

Let  be  the  distribution  function  (d.f.)  of  X^,  i>l, 

the  d.f.  of  Y ^ , j>l,  and  sot  F^  equal  to  the  d.f.  of  Ti“x£+Yi»  i>l, 

As  before,  let  M be  a positive  r.v.  with  d.f.  L and  assume  that  M 
and  the  {X^>  and  (Y^)  sequences  are  independent. 

Set  Q^»  d.f.  of  the  r.v.  Z^«  X^-M,  so  that 


Qi(z)  * / Gi(z+y)dL(y) 


(4.1) 


for  all  z in  Rj« (-“,»). 

Finally,  observe  that  since  we  have  not  assumed  that  the 
Tj,  j>l,  are  identically  distributed  r.v.'s,  it  is  possible  for 

the  partial  sums  Sn  of  section  1 to  converge  to  some  proper  r.v. 

in  distribution  (and  hence  with  probability  one  (a.s.)  on  fl)  . 

For  simplicity,  we  want  to  avoid  this  possibility  and  retain  the 
property  of  IID  r.v.'s  which  states  that  Sn-*+®  (a.s.).  Thus, 

when  we  consider  an  instance  where  ?X..  converges,  the  divergence 

of  Sn  to  +»  will  be  guaranteed (even  though  the  Y^'s  are  not 

identically  distributed)  by  assuming  that  ?Y^-*-+®  (a.s.). 

Now,  recall  the  definition  of  £t»  for  non-random  t>0,  given 
in  section  1 and  partition  the  interval  [0,®).by  the  sequence  of 


partial  sums  Sn,  n> 0 . Then 


PUt>  M)  “kz:0P^t>  M'  sk-  < Sk+1) 

«?  /tP(C+.>  M,  t<sfc  ts. -A  )iV(S .<>) 

k®0  o z 1 K 

® t 

•E  j P(Xk+1-M>t-*,  t-KT^JdPfS,,!  *) 


(4.2) 


k-0 


k+1 


^ ^ k 

- E / p(z.  .,>  t-r  )dTT*F.  (X)  + p (z. > t) 
k=l  o k 1 j=i  3 1 
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« . .*  t.  ix  ssswjQS&yfilf 


A 

*3 


where , after  conditioning  on  we  have  used  a familiar  property 

of  conditional  probabilities  (cf.  Krickeberg  [6]  p.  170  problems 
3 and  4) , . the  independence  of  Tk+1  and  and,  for  the  last 


equality,  the  fact  that  the  occurrence  of  the  event  [Zk+1>  t-£] 

implies  the  occurrence  of  the  event  [Tk+1>.  Therefore, 

using  the  d.f.'s  introduced  above  and  the  usual  notation  for  a 
convolution  product t 


TT*F,(t) 
j-1  J 


(F^Fj  • • *Ffe)  (t)  -P  (Sk<  t)  , 


we  can  write  (4.2)  in  the  form 


P(Ct>  M) 


(t)  + I ft. 

1 k-1  K+1 


*5rr*p.(t) 

j-l  3 


(4.3) 


where  ft^-  1-Q^  and  t>0. 

It  is  easy  to  see  that  under  the  assumption  of  the  last 
section  (that  is,  where  the  sequences  are  identically  as  well  as 
independently  distributed) , the  last  equation  reduces  to  equation 
(2.14)  of  section  2. 


Let  the  r.v.  n(M)  be  the  amount  of  time  that  the  random 
function  is  greater  than  M.  If  I is  used  to  denote  the 

indicator  function  of  the  set  of  positive  real  numbers;  that  is, 

1 i.  r > o 

Ky)  - (4.4) 

{ 0,  y < 0 

then  n(M)  can  be  written  as 


% 

« 

'k 

•j 

I 

i5 

i 

-5 


4 

1 


I 


I 

( 

% 


T 

f 


n (M)  - /"iU..-  M)dt 
o z 


(4  .5) 


Of  course,  n(M)  may  be  a defective  r.v.  in  the  sense  that  it 
may  take  the  value  +»  with  positive  probability.  Taking  expectations 
of  both  sides  of  (2.5)  it  is  easy  to  see  that 

En(M)  - /%(£.>  M) dt  (4.6) 

o c 

whether  the  RHS  is  finite  or  not. 
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We  note  in  pasting  that  the  case  when  the  underlying  stochastic 
structure  consists  of  sequences  of  IID  r.v.'s,  the  RHS  of  (4,6)  is 
infinite.  This  fact  might  motivate  one  to  ask  whether  or  not  this 
integral  is  Abel  tumm&ble  to  a finite  value.  That  is,  does 

MX)  » Ae"XtP(Ct>  M)dt 
o * 


converge  as  X*0+?  It  is  amusing  to  recognize  this  integral  as 
P(CT(jl)>  M)  , where  t(X)  is  an  exponentially  distributed  r.v.  and 

apply  Remark  4 or  Theorem  2 of  section  2 to  obtain 

as  X+0+,  if  y<+». 


A(X)^w”1E (X-M)*<+« 


Alternately,  use  only  the  classical  case  with  M random;  then 
an  application  of  the  Dominated  Convergence  Theorem  gives 


A(X)  - /VyP(CyX-i>  M)dym"1E(X-M)  + 

as  X+0+. 

Returning  to  the  non-IID  case  we  can  state  the  following 
Theorem  3 ; If  the  aeries  I E(X  -M)  + <+-  then 


En  (M) 


Z ,E(X  -M) 
V-l  v 


(4.7) 


This  follows  easily.  Just  let  Vn(t)  denote  the  general  term 


in  the  series  (4*3)  and  note  that 
/‘vn(t)dt  - E«n+l-M)  + 


Then  since  the  Vn  are  non-negative  and  integrable  over  [0,®), 


and  the  series  of  integrals  of  the  Vn  converge,  equation  (4.7) 


follows  from  a well-known  result  about  interchanging  summation 
and  integration  (e.g.  page  114,  (2)  [5]).  This  proves  (4.7). 
This  equation  then  shows  that  n(M)  is  a proper  r.v.  and 


n(M)  • E (X  - M) 


V-l 

with  probability  one, 


'i 
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CONFIDENCE  INTERVALS  FOR  A SUM  OF  RENEWAL 
PROCESSES  WITH  APPLICATION  IN  RELIABILITY 


Ronald  L.  Racicot 
Raaaarch  Directorate 
Banat  Waapons  Laboratory 
Watarvllat  Araanal 
Watarvllat,  Nav  York  12189 


ABSTRACT.  In  raliabillty  theory,  tha  tlma  flow  of  falluraa  of  a 
non-conatant  failure  rata  component  which  la  replaced  or  renewed  upon 
failure  forma  a renewal  proceea.  Tha  lnter-ar rival  tlmee  of  falluraa 
in  thla  caaa  are  independent  identically  diatributed  positive  random 
variables.  If  a system  which  le  composed  of  a number  of  such  components 
ie  considered  to  have  failed  if  one  of  its  components  falls,  then  the 
total  number  of  system  failures  is  a sum  of  tiff  individual  renewal 
procassea.  The  problem  considered  in  this  paper  Is  the  computation  of 
confidence  intervals  for  the  total  number  of  system  failures  over  a glvan 
period  of  time  from  total  systam  teats  and/or  individual  component  tests. 
Although  the  application  considered  is  one  from  reliability  theory,  the 
results  are  applicable  to  general  sum*  of  renewal  processes. 

In  solving  this  particular  problem,  the  reliability  engineer  often 
assumes  that  the  sum  of  renewal  processes  asymptotically  approaches  a 
non-homogeneous  Poisson  process  or,  after  a long  period  of  time,  a homo- 
geneous Poisson  process  with  exponentially  distributed  inter-arrival 
failure  times.  For  these  processes,  a chi-square  distribution  can  be 
used  to  determine  confidence  Intervals  for  total  number  of  failures  from 
which  confidenced  reliability  or  MTBF  can  be  determined.  It  can  be  shown, 
however,  that  the  Poisson  process  is  strictly  a local  property  for  sums 
of  renewal  processes  and  that  confidence  intervals  derived  from  these 
assumptions  are  generally  Incorrect.  This  is  shown  by  comparing  the  true 
variance  of  the  number  of  system  failures  with  the  variance  derived  assum- 
ing the  Poisson  process. 

A scheme  for  computing  confidence  intervals  is  presented  in  which 
the  first  3 moments  of  failure  times  of  the  component  processes  are  used 
to  compute  the  mean  and  variance  of  total  system  failures.  For  a large 
number  of  components,  the  normal  distribution  adequately  describes  the 
distribution  of  system  failures  from  which  confidence  intervals  can  be 
estimated. 


Pncalg  pin  Ms* 
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NOTATION. 

f<0 

F(t) 

?<t) 

h(t) 

hjU) 

H(t) 

HjU) 

H<t) 

Htru«^^ 

N(t) 

NjU) 

nc 

nf 

Hjl 


pdf  of  Intar-arrival  timaa  of  failuraa; 
cdf  torreaponding  to  f(t)$ 
i - ?(t) > 

ranaval  rata)  tha  unconditional  pdf  of  component  failure 
and  aubaaquent  ranaval; 

ranaval  rata  for  conponant  J; 

axpactad  valua  of  tha  numbar  of  ayatam  failuraa  ovar  the 
interval  (0,t); 

ranaval.  function  for  component  J ; tha  integral  of  tu  (t) 
ovar  tha  interval  (0,t); 

point  aatimata  ■£  H(t); 

* 

true  valua  of  H(t) ; 

number  of  ayatam  failuraa  ovar  tha  interval  (0,t); 

numbar  of  failuraa  of  component  j ovar  tha  interval  (0, t) ; 

number  of  component*; 

numbar  of  component  failuraa; 

numbar  of  miaaion*  ovar  ayatam  life; 


Pfl(t)  probability  of  N failuraa  in  time  t; 

R(t,T)  reliability  at  time  t for  an  interval  t; 

Rj(t'T)  reliability  of  tha  jth  component; 

Ra(T,nQ)  average  interval-reliability  over  ayatam  life  for  interval 
T and  numbar  of  interval*  1%; 

Ra(t,T)  ayatam  reliability  at  time  t for  an  interval  t; 

R?4(T,nn)  average  ayatam  reliability; 
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t ayatem  tine; 

0 tfalbull  ahape  parameter} 

n Waibull  acala  parameter; 

Uj  mean  inter-arrival  failura  time  foe  eonponant  j s 

P3j  third  eantral  moment  of  inter-arrival  failura  tiaaa 

for  component  J; 

Oj2  varlanca  of  intar-arrival  failura  tiaaa  for  eonponant  Jj 

and 

t interval  or  nlaaion  length  for  which  reliability  ia 

required. 

1.  INTRODUCTION . The  general  problem  la  to  deteraine  confidence 
intervala  for  reliability  of  a aarlaa  ayetea  of  coaponente  from  teat  data. 
Prevloua  aolutione  to  thla  problea  have  been  Halted  to  conetant  failure 
rata  coaponente,  binomial  alaalon  reliability  which,  la  conatant  in  time 
and/or  reliability  for  only  the  flrat  ayetea  failure  [1,2].  The  caae  con- 
aldered  in  thla  paper  which  le  often  of  more  intereet  to  the  reliability 
teat  engineer  involvea  a ayatem  comprlaad  of  mechanical  coaponente  which 
follow  non-conatant  failure  rate  dletrlbutlona.  The  ayatem  la  operated 
contlnuoualy  until  failure  of  any  of  ite  coaponente  occure  at  which  tine 
the  component  la  replaced  or  renewed  and  ayataa  operation  continued. 

For  the  aingle  component  which  la  replaced  or  renewed  upon  failure, 
the  renewal  rate  h(t)  deacrlbea  the  unconditional  failure  rate  of  the 
component  and  la  derlvad  from  tha  underlying  dletrlbutlon  of  inter- 
arrival  failure  tinea  [3,4]: 

H(t)  - f(t)  + ^f(t-x)h(x)dx.  (1) 

The  renewal  rate  ia  dlatlngulahed  here  from  the  haaard  or  conditional 
failure  rate  which  deacrlbea  failura  of  a non-repalrable  item. 


Interval  or 
rate  [5-7]: 

K<t,x) 


t+T 

“ 1 - / h(x)dx  for  aaall  x.  (2b) 

t 


alaalon  reliability  can  be  determined  from  the  renewal 
t+x_ 

■ 1 - / F(t+x-x)h(x)dx  (2a) 

t 


273 


for  practical  applications,  tha  transient  interval-reliability  can 
be  avaraga  over  system  Ufa  to  yield  a single  tine  independent  relia- 
bility index  that  characterizes  a given  component: 

nm 

Ob  i-l 

For  a sariae  eyatem  of  components 


R.(t,T>  - TT  Mt.T) 

J-l  3 

- °m  nc 

*..<W  - - Z TT  Rj 

n_  i-1  j“l 


Tha  elms  flow  of  failures  of  a non-conatant  failure  rate  component 
which  is  replaced  or  ranavad  upon  failure  forma  a renewal  process  [3]. 

Tha  inter-arrival  times  of  failures  in  this  caae  are  independent  iden- 
tically distributed  positive  random  variables.  If  a eyatem  which  is  com- 
posed of  a number  of  such  components  is  considered  to  have  failed  if  one 
of  its  components  fails  (sa/ias  system  assumption),  then  the  total  number 
of  system  failures  ia  a sum  of  the  individual  renewal  processes.  Tha 
problem  considered  here  ia  the  computation  of  confidence  intervals  for 
tha  total  number  of  system  failures  over  a given  period  of  time  from 
total  system  teste  and/or  individual  component  tests.  Although  the  ap- 
plication considered  is  one  from  reliability  theory,  the  results  are 
applicable  to  general  sums  of  renewal  processes. 

Many  properties  of  renewal  processes  and  sums  of  renewal  processes 
are  covered  in  the  literature;  so  only  the  final  results  era  summarised 
here  [3-7].  If  Nj (t)  represents  the  total  number  of  failures  of  com- 
ponent j over  time  interval  (0,t)  then  for  the  system 


i 


N(t)  - I N. (t)  • 

J-l  3 

For  components  which  fail  independently  of  one  another,  the  mean  and 
variance  of  N(t)  is  equal  to  the  sum  of  tha  mean  and  variance  of  the 
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component  processes t 


nc 

H(t)  - E{N(t)}  - l H, (t) 
J-l  3 

(7) 

dH(t)  nc 

h(t)  - • l h,(t) 

dt  j-1  J 

(8) 

nc 

Vsr(N(t))  • l Var{N  (t)}. 

J-1  3 

(9) 

For  small  mission  time  interval  t and  a large  numbar  of  components, 
tha  average  rallablllty  (3)  can  be  shown  to  asymptotically  approach  the 
following  valua  [3]t 

*sa<T»nm>  * 1 - H^T). 

(10) 

In  rallablllty  applications  than*  whara  the  abova  assumptions  hold,  It 
suffices  to  dsal  with  H(t)  for  tha  ayatam  with  rallablllty  balng  deter- 
mlnad  from  (10). 

2 . NON-HOMOOENBOUS  POISSON  PROCESS  AS  AM  APPROXIMATION  TO  N(t). 

In  conaldarlng  the  problem  of  non-constant  failure  rata  components,  tha 
rallablllty  anglnaar  of tan  asauaas  that  tha  sum  of  ranawal  procasaaa 
asymptotically  approachaa  a non-homoganaoua  Polaaon  procasa  (NHPP)  with 
lncraaalng  numbar  of  componanta  or,  aftar  a long  parlod  of  time,  a homo- 
ganaoua  Polaaon  procaaa  (HPP)  with  axponantlally  dlatrlbutad  lntar- 
arrlval  failure  tlmaa  [3].  For  thaaa  processes,  tha  chl-squara  distribu- 
tion can  bo  ussd  to  determine  confidence  intervals  for  total  numbar  of 
failures  from  which  confldancad  rallablllty  or  MTBF  (mean-tlma-batween- 
f allures)  can  be  determined.  In  what  follows,  however,  it  la  readily 
shown  that  tha  Polaaon  process  is  strictly  a local  property  for  sums  of 
ranawal  processes  and  that  tha  global  confidanca  Intervals  darlvad  from 
these  aaaumptlons  are  generally  Incorrect. 

Tha  distribution  of  numbar  of  failures  for  tha  NHPP  is  given  as 


V /a  \ ■ 

“lEi!  --H(t) 

/II  \ 

pH(t) 

Nl 

U*/ 

E{N(t)> 

- H(t) 

(12) 
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V»r{H(t>>  - H(t) 


C 


It  suffices  to  show  that  tha  trua  vartanca  of  tha  sub  of  renewal 
proeasaaa  doaa  not  generally  aqual  H(t)  aa  shown  by  (13) . Consider* 
for  example*  tha  asyaptotlc  ranawal  proeass  for  larga  t in  which  tha 
naan  and  varianea  for  conponant  J ara  givan  by  [3] 


MO  , t 

lia  -i « — t H.  (t)“  — 

t-H*  t yj  3 Pj 

°\2 

Var{N. (t))  “ t 
1 Vj3 

2 

in  which  Uj  and  o ara  tha  naan  and  varianea  of  tha  intar-arrival 

fsliura  tlmas.  Using  (7)  and  (9)  givas 

°c  , 

H(t)  « I - 
J-l  Mj 

nc  0 

Var(N(t)}  * J 

J-l  Uj3 


In  general,  H(t)  4 Var{N(t)}  and  tha  sub  of  ranawal  procaasas  for  this 
axanpla  doaa  not  approach  a NHPP  or  HPP  in  a global  aansa  no  nattar  how 

larga  nc  bacoaas.  For  aqual  components*  for  axanpla*  1/u  4 o2/y3  unlasa 

o2*y2.  This  is  tha  caaa  for  tha  axponantlal  distribution  but  is  only  a 
apacial  caaa  for  othar  diatributlons.  Although  tha  asynptotic  procaas 
for  larga  t waa  considered*  the  sane  can  be  shown  for  tha  sun  of 
ordinary  ranawal  procassas. 

3.  CONFIDENCE  INTERVALS  US IHC  COMPONENT  MOMENTS . Since  tha  sum 
of  ranawal  proeasaaa  (6j»ia  a sun  of  discrete*  lattice  type  randon 
variables,  it  asynptotically  approaches  tha  nornal  distribution  as  an 
envelops  with  increasing  number  of  coaponants  [8].  Confidence  intervals 
than  can  be  estimated  for  H(t)  using  nornal  tables  for  larga  number  of 
coaponants  with  8(t)  and  its  variance  being  determined  fron  test  data. 


1 


I (flft-  - ' J. 


. . »}!(££ 


As  will  b«  shown  later,  «n  oxer*  fsllurs  should  bs  sddsd  to  g(t)  In 
determining  uppsr  confldancs  limits  to  rsmovs  bits. 

Ths  rsnswsl  function  for  coaponsnt  j can  ba  astlmatsd  from  tha 
monants  of  tha  inter-arrival  timaa  of  avanta  for  larga  t [3J. 


Hjo(t)  * — + 
Mi 


qj2-^J2 


+ 0<l/t) 


V.r(»Jo<t))  - + Wt) 


Vj3  U *Ul*  3MjS 


for  tha  ordinary  ranawal  procaaa  and 

Hj-ft)  “ -*■ 

2 Mi 


Var{NJ#(t)}  « -Jj-  + ( J 


- H2L  > + 


0(l/t) 


for  tha  aqulllbrlum  ranawal  procaaa.  In  ths  ordinary  ranawal  procass 
all  componanta  ara  naw  at  t-0.  Tha  aqulllbrlum  procaaa,  on  tha  other 
hand,  la  ona  which  has  baan  running  for  a long  tins  bafora  It  la  first 
observed  (aaa  Cox  [3],  Chapter  2 for  mors  detailed  description  of  theaa 
processes). 

Casa  1:  Comp lata  Samples  with  larga  t 

For  this  case  the  momenta  can  ba  estimated  without  making  any  assump- 
tion about  tha  underlying  distribution: 


. “M 

*i  • V”« 


(22a) 


(22b) 


- nfj 

A n 

v3j  " nf j 1 
J J i-1 

(Xj  j-yj ) / (nf j-1) (nf j-2) 

(22c) 

Var{Hj (t) } - 

Var{Nj(t>}/nfj 

(23) 

in  which  Xi<(  i"l Afi  arc  n*.,  failure  times  for  component  j.  Sub- 

stituting T22)  into  (18),J(19)  and  (23)  or  (20),  (21)  and  (23)  yields 

A A A 

component  estimates  for  Hj(t)  and  Var{Hj(t)}.  System  H(t)  and  its  var- 
iance can  then  be  determined  from  (7)  and  (9)  from  which  confidence 
limits  on  the  true  value  of  H(t)  can  be  estimated  using  normal  tables. 

Case  2r  Censored  Samples 

For  this  case,  a theoretical  distribution  for  inter-arrival  failure 
times  must  be  assumed,  such  as  the  Ualbull  or  gamma,  with  the  moments 
being  estimated,  for  example,  using  maximum  likelihood.  Confidence 
limits  can  then  be  determined  assuming  the  normal  distribution  for  total 
number  of  pooled  failures. 

4.  SOME  NUMERICAL  RESULTS  FOR  CASE  1 


A particular  example  has  been  considered  to  study  the  frequency 
exactness  of  the  confidence  limits  described  above.  For  this  study 
Monte  Carlo  simulation  is  used  to  artificially  generate  sample  out comae 
for  a system  with  given  component  parameters'.  The  system  is  assumed  to 
be  composed  of  nc  identical  Weibull  components  with  parameters  n and  £. 

Using  these  parameters,  failure  times  for  a given  number  of  failures  are 
generated  for  each  component  using  random  numbers  with  the  quantities 

A A A A .S 

Uj,  Oj  and  Ujj  being  computed  from  (22).  From  these  Hj(t)  and 

A 

Var{Nj (t) ) are  computed  using  (18)  and  (19)  where  large  t is  assumed. 

A A 

Estimates  for  the  system  H(t)  and  Var  (H(t))  are  then  determined  from  (7), 
(9)  and  (23). 

Assuming  the  normal  distribution  for  H(t),  confidence  limits  on 
H(t)  can  be  determined  from  the  given  set  of  sample  outcomes.  This  is 
repeated  1000  times  for  a fixed  set  of  parameters.  The  normal  cdf, 
geuf  (H(t)),  is  evaluated  at  the  true  and  known  value  of  H(t)  for  each 
of  these  sample  outcomes.  For  exact  frequency  confidence  intervals, 
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the  function  gauf  (Htru,(t))  should  bt  uniform  on  (0,1.0).  Results 

indicate  that  although  the  confidence  limits  are  not  exact,  they  are 
close  enough  for  practical  purposes. 

Table  I lists  some  of  the  results  of  these  trials  for  the  upper 
90%  confidence  limit  on  H(t)  (lower  90%  confidence  limit  on  average 
reliability).  An  extra  failure  had  to  be  added  to  the  total  number  of 
system  failures  to  remove  bias.  Tor  exactness,  the  percent  of  trials 
in  which  Htrue  1*  greater  than  the  upper  90%  confidence  limit, 

A 

Hpo*  should  be  10%.  As  can  be  seen  from  the  results  In  Table  I,  the 
confidence  limits  are  close  to  this  requirement.  The  confidence  limit 

A 

H9Q,  therefore,  is  judged  to  be  exact  for  this  case  as  long  as  one  extra 

failure  is  added  to  total  number  of  test  failures. 

The  main  limitations  of  the  above  approach  are  the  requirement  for 
long  system  times  and  large  number  of  components  and/or  failures  for 
exactness.  Also,  in  computing  reliability  from  H(t),  small  mission  times 
(high  reliability)  are  required  for  the  approximation  (10).  The  computa- 
tional methods  Involved,  however,  are  relatively  straightforward  and  the 
approach  appears  to  be  a sound  one. 
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TABLE  I 

RESULTS  OF  MONTE  CARLO  TRIALS  TO  STUDY  UPPER  90% 
CONFIDENCE  LIMIT  FOR  SUM  OF  RENEWAL  PROCESSES 

NUMBER  OF  NUMBER  OF 
COMPONENTS  FAILURES  PER  H (t-5) 

COMPONENT  U 


% OF  TRIALS 
Htrue  > »90  * 


* 


10 

10 

51.7 

9.8 

10 

5 

51.7 

10.6 

5 

5 

25.8 

9.6 

2 

5 

10.3 

7.1 

A 

2. 

90  PERCENTILE  OF  DISTRIBUTION  GAUF  (H+l,  ffg) 

WEI BULL  COMPONENT  PARAMETERS:  n “ 1.0,  0 - 3.0 


i 
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DETECTING  an  unknown  signal  in  a multiple  object,  telemetry  situation 


John  Bart  Wilburn.  Jr. 
Instruaentatlon  and  Methodology  Branch 
US  Army  Electronic  Proving  Ground 
Ft.  Huachuca.  AZ  85613 


ABSTRACT.  The  problem  la  that  of  detecting  anomalie  patterns  in 
environmental  grid  data  approximately  coincident  with  a point  stimulus 
in  the  region  including  all  data  sources. 

The  particular  case  involved  is  to  replace  the  current,  rather  awk- 
ward,  technique  with  a more  concise  and  efficient  alogorithm  for  detect- 
ing anomalous  growth  patterns  of  tree-ring  chronologies  approximately 
coincident  with  volcanic  eruptions. 

STATEMENT  OF  THE  PROBLEM;  The  problem  I am  presenting  here  is  a 
problem  arising  in  my  climatology  research  on  estimating  climatic  anoma- 
lies following  volcanic  eruptions.  People  have  long  suspected  that  such 
anomalies  would  occur.  (Franklin.  1783  Diary)  It  seems  as  no  surprise 
to  most  people  that  something  as  majestic  as  a volcano  should  perturb 
climate  and  yet  compelling  evidence  has  not  been  found,  probably  due  to 
the  short  length  of  meteorological  data  records  available  and/or  improper 
methods  of  analysis. 

I am  estimating  these  climatic  anomalies  by  computing  a regression 
model  for  climatic  variables  such  as  seasonal  temperature  and  precipi- 
tation averages  based  on  tree-ring  chronologies.  In  this  way  I am  hoping 
to  attach  to  a much  longer  record  of  data.  The  regression  model  is  a 
principal  component  regression  calculation  which  I discussed  at  this  con- 
ference last  year;  and  uses  continuous  tree-ring  chronologies  and  a con- 
current meteorological  record  caken  at.  or  near,  the  tree  site  for  which 
the  model  is  computed.  That  is,  for  each  tree  site  there  is  one  model 
for  each  climatic  variable  for  each  season. 

With  these  models,  or  transfer  functions,  I estimate  the  climatic 
anomalies  following  volcanic  eruptions  by  applying  anomalous  sequences 
of  annual  tree  growth  rings  following  those  eruptions  as  input  to  the 
transfer  function. 

The  problem  I am  presenting  here  is  how  to  improve  the  accuracy  of 
the  detection  of  anomalous  tree  growth  due  - probably  - to  volcanic 
activity  and  to  perform  the  detection  more  economically. 

This  may  not  seem  related  to  telemetry  in  the  usual  sense;  however, 

1 contend  that  it  is,  or  has  within  it,  a problem  in  multiple  object 
telemetry.  In  this  case,  telemetry  is  interpreted  as  the  receipt  of  a 
signal  transmitted  by  a sensor  operating  in  an  environment  wherein  the 
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signal  Is  supposed  to  contain  information  about  it's  environment. 

In  ay  case,  the  sensor  is  the  tree.  The  signal  is  the  chronology  of 
it's  annual  growth  rings.  These  growth  ringB  differ  in  width  in  response 
to  climatic  conditions  present  at  the  site.  Figure  1 illustrates  a section 
of  a chronology  and  a graph  of  the  ring  widths.  As  one  can  see,  this  signal 
looks  very  ouch  like  many  other  kinds  of  signals  one  may  encounter  in  a 
telemetry  operation. 

The  signal  is  supposed  to  contain  information  about  the  climatic  con- 
ditions at  the  tree  site  during  the  time  that  the  growth  ring  was  influ- 
enced. A considerable  amount  of  work  done,  and  currently  underway,  at 
the  Laboratory  of  Tree  Ring  Research  at  the  University  or  Arizona  supports 
this  supposition.  The  problem  is  that  not  all  tree  ring  chronologies  are 
indicative  of  climate.  Only  sensitive  trees  have  chronologies  which  re- 
flect their  past  climate  and  then  only  when  properly  interpreted. 

There  are  many  factors  which  influence  a tree's  response  to  a partic- 
ular climatic  variable.  Topography  is  the  primary  class  of  these  factors 
which  include:  water  runoff,  exposure  (north  or  shady  side  versus  south 

or  sunny  side),  altitude  (growth  season),  subsurface  conditions  influenc- 
ing root  structures,  availability  of  ground  water  and  density  of  tree 
growth.  However,  these  factors  are,  for  the  most  part,  reasonably  con- 
stant over  the  time  period  considered;  that  is,  a few  hundred  years.  Thus, 
the  sensitivity  of  a tree  to  climatic  change  can  be  considered  to  be  reason- 
ably constant  except  when  it  is  obviously  not  true  as  in  cases  such  as  fire, 
earthquake,  etc.  Figure  II  illustrates  these  opposite  conditions,  compla- 
cent and  sensitive  trees,  as  a function  of  topography. 

A sample  illustration  of  this  sensitivity  is  shown  when  we  consider 
a tree  which  is  living  in  an  abundant  environment  (as  seen  by  the  tree) 
with  a surplus  of  water.  This  tree  would  have  a "complacent"  ring  series 
because  such  a tree  will  not  suffer  much,  if  at  all,  during  a relatively 
dry  growing  season  wiin  less,  but  still  adequate,  precipitation.  However, 
a farmer  in  the  same  area  with  a crop  tuned  to  the  normal  precipitation 
(abundant  from  the  tree's  point  of  view)  might  consider  that  dry  spell  a 
near  disaster.  This  complacency  is  compounded  when  one  notes  that  most 
trees  tend  to  integrate  over  several  years  with  the  emphasis  placed  on 
the  climate  of  the  year  preceding  the  current  growing  season. 

The  point  is  that  one  may  see  that  a given  species  of  tree  may  have 
many  different  responses  tc  highly  similar  climates,  depending  on  the 
specific  locations  of  the  trees  and  the  conditions  preceding  the  current 
growing  season  of  up  to  three  years. 

Now  it  is  possible  to  see  the  nature  of  the  problem  I am  addressing. 

As  shown  in  Fig.  Ill,  I have  selected,  as  sensors,  ten  tree  sites;  all 
Douglas  Fir  and  all  with  fairly  high  variance  in  the  chronology  as  an 
indication  of  sensitivity.  These  ten  tree  sites,  indicated  by  the  dots, 
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constitute  a grid  of  climatic  sensors,  each  of  which  haa  a response 
function  defined  only  for  it's  own  location,  but,  which  haa  been  assumed 
to  be  reasonably  tine  invar lent. 
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Mow  the  problem  becomes  somewhat  more  complicated.  This  is  because 
I am  looking  for  the  result  of  an  unknown,  but  probably  different  response 
function  to  the  output  from  another  response  function,  which  is  the  atmos- 
phere,  also  unknown  and  responding  to  a point  stimulus  (the  volcanic  erup- 
tion) . It  is  the  nature  of  this  atmospheric  response  function  that  I would 
like  to  eventually  learn  something  about  from  the  regression-based  esti- 
mates of  the  climatic  anomalies  mentioned  earlier. 

The  response  of  the  atmosphere  to  this  stimulus  at  some  location  on 
the  earth  is,  most  likely,  some  function  of:  the  type  of  stimulus;  that 

is  large,  small,  duration,  etc;  the  location  of  the  tree  site  (sensor); 
the  time  lag  from  the  eruption;  the  time  of  the  year  and  the  initial  con- 
ditions at  the  time  of  the  year. 

The  response  function  of  the  trees  to  the  atmospheric  (climatic)  con- 
ditions is  some  function  of:  the  season;  it's  own  serial  correlation; 

it's  initial  condition  and  it's  location  (topography).  The  response 
function  of  the  trees  omits  the  physiological  variables  as  I am  consider- 
ing them  as  explicit  since  I am  not  modeling  the  tree  growth. 

The  first  part  of  the  project,  which  is  the  subject  of  this  paper, 
was  to  detect  the  anomalous , Indirect  response,  if  any  exists,  of  the 
trees  to  volcanic  eruptions.  To  date,  the  method  of  detecting  these  pos- 
sible anomalous  sequences  of  growth  rings,  or  anomalous  signals,  has  been 
as  follows:  First,  I considered  only  one  site  at  a time;  thereby  permit- 

ting me  to  ignore  all  parameters  relating  to  location.  Second,  the  tree 
integrates  over  all  seasons;  so,  for  ths  purposes  of  signal  detection,  I 
must  ignore  season.  Now  then,  it  must  be  remarked  that  the  amount  of 
change  in  the  tree's  variance  due  to  volcanic  activity  may  be  only  a 
very  small  portion  of  the  total  variance  in  the  tree  ring  chronology. 

Assuming  that  the  chronology  is  a weakly,  stationery,  random  serlea, 
a kind  of  signal  averaging  was  accomplished  to  detect  a possible  average, 
or  typical,  response  signal  of  the  tree  to  specific  "types"  of  volcanic 
eruptions. 

The  tree  ring  data  were  formed  into  a lagged  array,  as  shown  in  Fig. 

IV,  wherein  the  lag  is  fourteen  yearB . The  lag  is  more  than  sufficient 
to  accommodate  the  serial  correlation  of  ubout  three  years  and  is  guessed 
to  be  sufficient  time  to  cover  any  lag  of  the  propagation  of  the  atmos- 
pheric phenomena.  This  lag  also  side-steps  two  favorite  cycles:  lunar 

and  so lar . 

The  data  in  an  array  s <ch  as  shown  in  Fig.  IV  contains  all  of  the 
data  and  as  such  is  referred  to  as:  Dtnm,  the  total  ring  array.  A 
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similar  array  i a formed  from  cha  columns  of  D*  such  that  tha  data  of  tha 
growth  ring  index  (percent  of  noraal  growth)  in  tha  first  row  of  each 
column  is  tha  data  of  a volcanic  eruption  of  a specified  class  of  erup- 
tions paraaatarlsad  by  sisa  of  eruption  and  tha  region  of  the  earth  con- 
taining the  volcano.  This  data  array  is  referred  to  as  the  signal  array 

and  ia  denoted  byt  D*  . 

nq 

A third  array  is  the  background  array,  D*5 ; and  is  the  direct  subtrac- 
tion of  D*  froa  D*:  Jp  - Dc  0 D*. 

Mow  then,  the  row  averages  of  each  of  these  arrays  were  computed, 
lhase  constitute  average  growth  curves  of  the  tree  for  a fourteen-year 
period  under » noraal  conditions,  conditions  coincident  with  volcanic 
activity  of  the  class  specified,  and  under  conditions  excluding  those 
concurrent  with  that  specific  class  of  volcanic  activity. 

A CHI-squara  comparison  was  Bade  with  the  following  hypotheses: 

1.  That  the  average  growth  curve  of  the  signal  array,  D*,  was  indis- 
tinguishable froa  the  average  growth  curve  of  the  total  array,  D*. 

2.  That  tha  average  growth  curve  of  the  signal  array,  D*,  was  indis- 
tinguishable froa  that  of  the  background  array,  Db, 

3.  That  the  average  growth  curve  of  the  background  array,  D**,  was 
distinguishable  froa  that  of  the  total  array,  Dc. 

4.  That  the  average  growth  curve  of  the  total  array,  Dc,  was  disting- 
uishable froa  the  flat  curve  of  the  average  of  the  total  chronology. 

If  all  of  these  hypotheses  art1  rejected,  then  the  average  growth  curve 
of  that  signal  array  is  considered  a probable,  valid  response  to  a volcanic 
eruption  of  the  class  specified.  Fvoa  about  300  cases,  35  passed  this  test 
at  the  99Z  confidence  level. 

A second  test  was  devised  Involving  the  coapariaon  of  the  first  eigen- 
vectors of  the  variancs/co-variance  matrix  of  the  ring  signal  array,  D*, 
coaputed  two  ways.  The  varlance/co-varlance  aatrices  of  the  signal  array 
were  coaputed:  (1)  using  the  row  averages  of  the  total  ring  array,  dit  as 

the  aaan;  and  (2)  using  the  row  averages  of  the  ring  signal  array, 

Da,  in  the  uaual  fashion.  Thus  we  have: 

0%  (3c> . ;~h:  i (d»tj  - 5't>  «*tJ  - i 

and 

c*™  I H'ij  - 3V  <d'1J  - 3V  j 
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Then  extract  the  eigenvectors: 

ce  (dfc)  c <dc)  . E (d*)  * (dfc) 
C nn  Enn  Enn  -'y^rui 


end 


C*  (d*)  . (d*)  - 

nn  nn 


Next,  conpere  E (•'  ) end  E>. ’d*) , if  they  are  aignlflcantly  different, 
in 

then  the  Array  0*  la  usable  aa  an  array  of  tree  ring  data  comprised  of  sig- 
nificant responses.  This  was  a very  stringent  test  and  out  of  the  33  can- 
didates, only  six  passed. 

The  conputer  time  required  to  perform  all  of  these  tests,  for  all  ten 
sites  and  thirty  classes  of  volcanic  eruptions,  was  about  ten  hours  on  a 
CDC  630Q.  This  did  not  include  the  comparison  of  the  eigenvectors,  but 
only  their  computation.  Thus,  the  need  for  a new  method. 

Another,  related,  reason  for  initiating  this  work  is  to  begin  the 
development  of  a statistical  description  of  tree  growth  which  will  contain 
information  about  both  the  spatial  relationships  of  the  tree  sites;  and, 
simultaneously,  the  temporal  behavior  of  the  individual  tree  sites  and  the 
interrelationship  between  the  two  descriptions  of  the  tree  growth. 

One  of  the  approaches  to  this  problem  I have  started  is  to  devise  an 
entropy  function  for  each  column  of  the  total  array. 

"V  ? Plli  U*  P‘h 


1 ■ tree  site  location 
1 ■ row 
j “ column 

where  is  computed  using  the  statistics  of  the  chronology. 

The  intent  was  to  detect  a departure  from  normal  growth  during  the  four- 
teen year  period  following  any  year.  The  data  array,  Dc,  would  then  be 
collapsed  into  a one  dimensional  sequence  of  entropy  values  for  each  tree 
site.  These  datn  streams  could  then  be  considered  as  variables  indexed 
by  location  and  analysed  by  multivariate  techniques  for  the  time  invariant 
i relationship  of  the  time  lagged  behavior  between  each  site.  Furthermore, 
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by  computing  a conditional  antropy,  tha  serial  correlation  of  the  traea 
could  be  accounted  for. 

In  this  way i it  is  hoped  that  thoae  tree  sites  with  large  and/or 
correlated  variance  of  abnormal  behavior  will  be  selected  by  eigenvector 
analysis . 

Another  variation  of  this  method  would  be  to  form  a lagged  array  from 
one  of  the  principal  components  of  a spatial  array  of  tree  ring  chronol- 
ogies sampling  an  entire  region.  Than,  to  perform  the  entropy  calculation 
of  that  lagged  array.  This  would  highlight  abnormal  growth  occurring 
simultaneously  throughout  the  region. 

Now,  1 would  like  to  hear  any  comments  and  suggestions  the  panel  might 
wish  to  make. 
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Fig.  Ill 

Distribution  of  Tun  Tret*  Situs 
(Sensors)  in  North  America 
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OUTLIER  DETECTION  PROCEDURES  IN 
TRAJECTORY  DATA  REDUCTION 


t 


Killian  S.  Agee  and  Robert  H.  Turner  ' 

Analysis  and  Computation  Division  ! 

National  Range  Operations  Directorate 
US  Amy  White  Sands  Missile  Range  ; 

White  Sands  Missile  Range,  New  Mexico  i 

i 

ABSTRACT.  Outlier  detection  procedures  are  used  extensively  in  tra- 
jectory  data  reduction  at  White  Sands  Missile  Range  (WSMR) . There  are 
three  distinct  circumstances  in  which  outlier  detection  procedures  are 
used  in  trajectory  data  reduction.  These  are  recursive  filtering, 
weighted  least  squares  batch  processing  of  trajectory  measurements,  and 
unweighted  least  squares  processing.  Each  of  these  processes  use  a 

different  outlier  detection  procedure.  This  paper  describes  the  use  of  ■ 

outlier  detection  procedures  at  WSMR,  the  specific  procedures  used  in  the  I 

various  data  reduction  processes,  and  the  limits  within  which  each  of  the  j 

procedures  performs  satisfactorily.  Of  prime  concern  are  the  situations  | 

in  which  the  outlier  detection  procedures  fail  to  detect  some  obvious 
outliers.  These  undetected  outliers  destroy  automated  data  reduction 

procedures  causing  a significant  number  of  reruns  with  human  detection  j 

of  these  outliers.  The  performance  of  various  outlier  detection  proced- 
ures, those  currently  used  at  WSMR  and  some  others  is  shown  on  typical  j 

data  sets  for  which  the  procedures  fail.  It  is  hoped  that,  in  addition 
to  obtaining  some  suggestions  on  improving  outlier  detection  used  in 
WSMR  data  reduction,  this  presentation  will  stimulate  further  investiga- 
tion into  outlier  detection  methods  by  Army  researchers. 

1.  INTRODUCTION . Some  outlier  detection  techniques  for  batch  and 
recursive  processors  which  produce  trajectory  estimates  from  instrumenta- 
tion measurements  are  described. 

Although  there  are  some  outlier  detectors  in  the  batch  processor,  a 
pre-processor  is  necessary  to  eliminate  those  outliers  which  could  ruin 
the  batch  process  beyond  recovery.  This  pre-processor  removes  the  trend 
using  an  unweighted  least  squares  process  and  detects  outliers  using  two 
tests.  A better  way  of  removing  the  trend  is  necessary  when  some  types 
of  outliers  are  present.  Also,  since  some  types  of  outliers  produce  a 
masking  effect  which  makes  sequential  procedures  insensitive,  other  tests 
are  needed.  The  outlier  detectors  are  good  in  the  batch  processor  and 
very  good  in  the  recursive  processor. 

2.  PRE-PROCESSOR 

a.  Process . Small  samples  (one  to  four  seconds)  of  10  to  50  measure- 
ments of  each  observation  are  fit  to  a second  degree  polynomial  in  time 
using  unweighted  l*ast  squares. 
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The  observation  model  is 


*0  * Vi  ♦ 


.2 

*2*1  * 


i ■ 1,  n 


or 


Z ■ TA  ♦ e 

2 

where  e is  random  noise  with  zero  mean  and  o variance. 
T 

Minimi  tint  c t with  respect  to  A we  have 

A ■ (TTT)-1TTZ 
and  the  set  of  residuals 


r - Z - TA 

b.  Outlier  Detection.  Sample  skewness  and  kurtosis  coefficients  are 
comput ed  Iron  the  residual s 


n A n , - 
b2  - (n-3)  l rJ/(  J r?)2 
1 i-1  1 i*l  1 

.If  either  or  b2  exceed  their  respective  5%  significance  level  critical 

values,  the  observation  corresponding  to  the  largest  residual  is  deleted 
and  the  entire  process  is  repeated  with  the  remaining  observations. 

We  hope  that  this  initial  process  will  detect  most  of  the  outliers 
automatically  with  as  little  human  intervention  as  possible  and  a mini- 
mur  of  false  alarms.  When  there  are  too  many  outliers  or  a few  large 
ones  it  is  almost  impossible  to  detect  them.  In  these  cases,  if  the 
presence  of  an  outlier  is  detected,  vhe  good  observations  adjacent  to  the 
outliers  are  the  ones  rejected. 

c.  Examples.  These  two  .samples  show  that  the  presence  of  outliers 
can  sometimes  distort  a curve  fit  so  much  that  outliers  cannot  be  detect- 
ed. Furthermore,  if  the  presence  of  outliers  were  detected,  sometimes 
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the  good  observations  arc  rejected  while  the  outliers  r attain.  Bach  sample 
has  three  obvious  outliers  which  were  not  detected  from  the  first  set  of 
residuals. 

(1)  Example  1.  Assume  some  other  test  could  detect  the  presence  of 
outliers  and  that  the  observation  with  the  largest  residual  was  rejected. 
One  of  the  outliers  would  be  rejected.  The  two  previously  described  tests 
and  rejection  criteria  would  now  sequentially  detect  and  reject  the  two 
remaining  outliers. 


(2)  Data  for  Example  1. 


Obs 

Res(1) 

Res (2) 

Ras(3) 

Res (4) 

.21709 

-.33222 

-.29484 

-.20135 

-.00001 

.21824 

-.31419 

-.26636 

-.17482 

.00001 

.95519 

.44164 

.49745 

.58588 

.94511 

.45245 

.51376 

.93499 

.46522 

.22288 

-.22199 

-.15714 

-.08487 

.00001 

.22405 

-.19391 

-.13101 

-.06642 

-.00002 

.22530 

-.16375 

-.10528 

-.04951 

.00002 

.22652 

-.13161 

-.08006 

-.03424 

.00002 

.22770 

-.09751 

-.05535 

-.02063 

-.00004 

.22900 

-.06128 

-.03100 

-.00852 

.00000 

.23028 

-.02307 

-.00715 

.00195 

.00001 

.23155 

-.01714 

.01622 

.01079 

-.00001 

.23286 

.05940 

.03915 

.01805 

.00000 

.23418 

.10367 

.06162 

.02370 

.00001 

(3)  Example  2.  Again  assume  that  some  other  test  could  detect  the 
presence  of"outliers  and  that  the  observation  with  the  largest  residual 
was  rejected.  The  first  point  rejected  would  b«  the  good  observation  in- 
between  the  outliers.  Two  outliers  would  be  the  next  to  go.  Further 
application  would  reject  good  observations  and  never  get  the  one  re- 
maining outlier.  The  outlier  detectors  previously  described  don't  indi- 
cate the  presence  of  outliers  'in  any  set  of  residuals. 


t 


293 


1 TtBl  tirear f iiiir<lmi  i ii  MV  lift  HiiitflWlI 
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Obs 

Res(l) 

Res (2) 

Res (3) 

Res (4) 

-1.70987 

-.15777 

-.28786 

-.36369 

-.37731 

-1.70942 

-.00020 

-.03242 

-.08045 

-.10634 

-1.70893 

.10548 

.14636 

-.12669 

.09700 

-1.70845 

.15923 

.24843 

.25767 

.23267 

-1.70793 

.16109 

.27383 

.31254 

.30071 

-1.70741 

.11102 

.22252 

.29127 

.30108 

-1.70682 

.00910 

.09458 

.19393 

.23385 

-1.70626 

-.14478 

-.11009 

.02041 

.09892 

-1.70571 

-.35060 

-.39148 

-.22927 

-.10368 

-1.70510 

-.60828 

-.74951 

-.55502 

-.37389 

-1.70449 

-.91788 

-1.18425 

-.95693 

-.71177 

1.43777 

1.86223 

1.44596 

1.44602 

1.45641 

.86545 

1.16012 

-1.70257 

-2.15818 

1.44667 

.47314 

-.54153 

-.17727 

.40876 

d.  Conclusion.  More  work  needs  to  be  done  in- 

(1)  Removing  trends  in  the  presence  of  outliers. 

(2)  Determining  whether  the  testing  arid  rejection  of  small  subsets 

of  observations  as  a one  time  process  is  more  effective  than  the  sequential 
application  of  testing  and  rejecting  of  one  observation  at  a time. 

3.  BATCH  PROCESSOR 

a.  Process.  This  is  a weighted  least  squares  process  which  uses 
observation  variances  as  weights.  It  produces  all  position  vector  esti- 
mates simultaneously.  It  is  a nonlinear  process  which  linearized  about 
a guess  trajectory  and  is  iterated  to  convergence  before  editing.  The 
neasurcment  model  for  the  a*h  observation  at  the  ith  time  point  is 

Zia  * ha<xi)  * cia 
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where  e^Q  is  random  noise  with  zero  mean  and  o^Q  variance. 
Solve  for  x by  minimizing  the  weighted  sum  of  squares 


y y pi.-WV 

i»l  ael^  \ °ia  / 


with  respect  to  x^. 

b.  Outlier  Detection 

(1)  At  each  time  point  i,  for  each  observation  a in  the  solution  a 
normalized  residual  is  computed 

♦ ZL-ha<ii> 

2 

where  a (r^a)  is  the  estimated  residual  variance  approximated  by 
°Vla)  • •?,  * Ha(HTm)-‘  H* 

..  »V*i> 


htwh  - [ -S-1 

aeIl  °ia 

t * I 

If  3<|r^Q|<5,  the  respective  observation  is  deleted  temporarily. 

If  I r^a I >5 , the  respective  observation  is  deleted  permanently. 

If  either  of  these  tests  reject  any  observations  the  solution  is 
iterated  to  convergence  with  the  remaining  observations  and  tested  again. 
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This  test  indicates  those  observations  whose  residuals  are  not  consistent 
with  their  variance  and  geometry. 

(2)  When  no  more  observations  are  rejected  with  the  previous  test,  a 
sun  of  weighted  residuals  for  each  observation,  over  ell  the  time  points 
it  was  processed  is  computed. 


‘a"  i jael^  ria 


If  max  |Rft|>3,  all  of  the  a*h  observations  are  deleted  from  all  further 

processing,  all  temporarily  deleted  observations  are  enabled  and  the  whole 
process  is  reiterated.  This  test  indicates  a consistent  bias  in  an 
instrument's  set  of  observations. 

4.  RECURSIVE  PROCESSOR 

a.  Process.  This  is  an  extended  Kalman  filter  which  produces  state 
vector  (position , velocity,  acceleration)  estimates  sequentially. 
Observation  variance  estimates  are  also  produced  sequentially.  The  pre- 
dicted state  estimate  is 

x(k+l  |k)  - F(k)x(k) 

the  corrected  state  estimate  is 

x(k+l)  « x(k+l |k)  ♦ K(k)r(k+1 |k) 

where  K(k)  is  the  Kalman  filter  optimal  gain  matrix  and 

r(k+l |k)  - Z(k+1)  - h(k*l)x(k+l |k)  • 

is  the  vector  of  observation  residuals. 

The  variance  estimate 

2 

°i(k*l)  - T7-Qi(k>l) 


is  a steady  state  function  of  the  exponentially  weighted  sum  of  squared 
residuals 
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Q^k+l)  - wCQ^IO^Ck+lIk)] 

0<w<l 

b.  Outlier  Detection.  For  each  observation  i at  time  k+1 , a two 
level  outlier  detection  scheme  is  used  on  the  normalized  residual 


r*(k+l|k) 


^(k+l  |k) 


o2(r.)  * oj(k)  ♦ 


H. 


3hi(x) 

3x 


P is  the  state  covariance  matrix.  . 
* 


(1)  If  r^(k+l|k)>12  reject  the  i^h  observation  for  time  k+1, 

(2)  If  4<ri(k+l |k)<12  updato  (^(k+1). 

(3)  If  0<r*(k+l |k)<4  update  Q^Ck+l),  o2(k+l)  and  x(k+l|k). 
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ABSTRACT.  The  development  of  simulations  of  physiological  systems  has  been 
used  as  a guide  in  the  design  of  animal  experimentation  used  to  study  such  en- 
docrine functions  as  glucose-insulin  interaction  and  testosterone  dynamics. 

Models  of  pulmonary  respiratory  function  have  been  studied  in  Bn  effort  to 
redesign  several  pulmonary  function  teats  so  that  particular  system  parameters 
could  be  evaluated  directly  from  test  results. 

Model  development  is  thus  a useful  procedure  in  studying  physiological 
systems,  for  it  focuses  attention  on  the  cause-effect  relationship  at  each 
stage  of  the  homeostatic  process,  and  thus  integrates  In  a systematic  way  all 
that  is  known  about  a particular  system.  In  addition,  the  requirements  and 
constraints  of  the  model  development  clearly  point  out  gaps  in  our  knowledge 
of  overall  system  function,  and  in  an  effort  to  obtain  this  missing  data  one 
can  utilize  the  model  structure  in  designing  the  necessary  experimental  proto- 
cols. The  results  of  these  experiments  will  help  complete  the  model  in  a 
physiological  meaningful  way,  and  once  complete,  the  model  can  be  used  to  study 
the  effects  of  parameter  variation  on  system  response  under  both  normal  and 
pathological  situations. 

The  simulation  can  be  used  in  conjunction  with,  and  as  a supplement  to, 
animal  experimentation.  For  example,  the  large  number  of  extraneous,  and  possibly 
even  unknown,  factors  which  often  obscure  or  invalidate  the  results  of  live 
animal  experiments  are  not  present  in  the  model.  The  model  user  must  be  able 
to  take  advantage  of  the  resulting  simplified  approach  to  the  physiological 
system,  but  must,  at  the  same  time,  be  careful  not  to  oversimplify  the  complex 
physical  interrelationships  to  the  point  at  which  the  results  are  physiologically 
meaningless. 


This  presentation 
of  model  development  in 
subsystem  operation  and 


will  utilize  several  case  studies  to  demonstrate  the  use 
designing  experiments  to  study  overall  system  function, 
compartment  analysis,  and  parameter  evaluation. 


1 
t»  - 

i 


i 

1 

I 
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PrececHm  pagi  Hank 


1.  INTRODUCTION.  Model  development  is  a useful  procedure  in  studying  physiological 
control  systems,  for  It  focuses  attention  on  the  cause-effect  relationship  at  each 
step  of  the  control  process,  end  integrates  in  s systematic  way  all  that  la  presently 
known  about  the  particular  system.  Models  can  be  presented  in  many  different 
modes,  some  of  which  might  be  scaled  versions  of  the  actual  system,  physical  analogs 
consisting  of  hsrdware  elements  or  alternative  living  systems,  and  both  analog  or 
digital  computer  simulations.  The  emphasis  in  this  presentation,  however,  will  be 
on  the  mathematical  descriptions  of  system  function  and  the  computer  simulations 
of  these  relationships*  In  particular,  the  application  of  models  in  research, 
teaching,  end  the  design  of  experiawnts  will  be  discussed  in  terms  of  specific 
examples  of  endocrine  and  respiratory  function. 


Early  application  of  the  control  engineer's  approach  to  phyalologlcal  system 
studies  appeared  in  the  work  of  Grodlns*  and  Starkb  in  their  studies  of  respiratory 
function  and  pupillary  motion,  respectively  (1,2).  Grodlns*  first  model  of  respira- 
tory function  divided  the  body  into  two  compartments,  the  lungs  and  the  remaining 
tissue.  In  addition,  he  Assumed  that  control  of  respiration  was  purely  a function 
of  carbon  dioxide  concentration  at  particular  sites  within  the  circulation.  Circu- 
lation time  was  also  assumed  to  be  negligible.  Validation  studies  were  then 
performed  on  the  model,  at  which  time  model  results  were  compared  with  known 
experimental  results  from  a living  system.  Deviations  between  the  model  and  the 
living  system  suggested  several  additions  to  the  model,  which  Grodlns  incorporated 
in  subsequent  more  complex  representations,  A second  model  Included  circulation 
time  as  a non-negllgible  parameter,  and  added  the  effect  of  alveolar  dead  apace 
to  the  two -compartment  study.  This  more  advanced  model  waa  able  to  be  used  to 
study  both  normal  respiratory  function  and  the  abnormal  behavior  associated  with 
Cheyne-Stokes  breathing0.  A third  model  added  the  brain  compartment  to  the  original 
structure,  and  also  Included  the  effect  of  oxygen  concentration  on  respiratory 
control.  The  Grodlns  models  illustrate  one  approach  of  model  building,  which 
begins  with  a simple,  but  non-trlvlal,  model  and  adds  additional  complexity  to  make 
the  model  results  agree  with  the  results  of  physiological  experimentation. 

Stark,  on  the  other  hand,  used  the  modeling  approach  in  designing  his 
experimental  protocol  to  study  pupillary  diameter  as  a function  of  light  incident 
to  the  eye.  He  used  a qualitative  description  of  the  system  to  develop  a block 
diagram  representing  the  functional  portions  of  the  pupillary  control  mechanism. 
Available  data  could  then  be  used  to  describe  quantitatively  the  overall  closed 
loop  system,  but  it  could  not  be  used  to  develop  the  mathematical  relationships 
between  the  subsystem  variables  within  the  closed  loop.  Stark  then  designed  an 
experiment  which  would  produce  the  necessary  information  on  open  loop  response  in  an 
in  vivo,  physiologically  undisturbed  human  subject.  Incident  light  was  focused  at 
the  plane  of  the  iris  so  that  the  cross  section  of  light  entering  the  eye  was  less 
than  the  smallest  pupil  diameter.  Incident  light  intensity  and  pupil  response  were 


first  published  in  1954 
first  published  in  1959 
Cheyne-Stokes  breathing; 
(tidal  volume) 


periodic  increase  and  decrease  in  depth  of  breathing 
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then  recorded  with  an  Infrared  electro-optical  arrangement,  from  which  frequency 
reaponae  curvea  could  be  developed.  Tranafer  functions  for  the  open  loop  eyeteai  * 

were  then  constructed  and  a mathematical  description  of  the  overall  system  was  thus  I 

determined.  Stark  thus  used  a modeling  approach  to  describe  the  information  flow 
through  the  system,  and  to  see  how  available  data  could  be  used  to  quantitatively  * 

describe  system  function.  When  such  descriptions  could  not  be  developed,  the  • 

structure  and  suggested  cause-effect  pathways  within  the  model  could  be  used  to  aid 
in  the  design  of  an  experiment  which  would  produce  the  specific  information  necessary 
for  system  quantification.  Although  this  procedure  was  satisfactory  in  the  case 
of  pupillary  dynamics,  it  is  not  always  possible  to  satisfy  model  requirements 
within  physiological  constraints.  However,  the  modeling  approach  does,  as  a minimum, 
suggest  guidelines  for  experimental  design  which  would  result  in  the  necessary 
input-output  analytical  relationships  between  system  variables. 

2.  APPLICATION  OF  WOOLS.  Models  of  physiological  systems  have  been  uaod  in  research,  < t 

teaching,  and  the  design  of  experiments.  There  are  two  distinct  steps  Involved  in 

applying  the  modeling  approach  to  experimental  deaign.  In  developing  the  model, 

areas  where  the  available  data  are  not  adequate  to  explain  the  operation  of-  the 

system  will  become  clarified,  and  a study  of  the  flow  of  information  necessary  to 

completely  implement  the  model  will  suggest  tests  and  experimental  procedures  for 

generation  of  additional  data.  Such  an  example  was  discussed  previously  in  the 

description  of  Stark's  work  Then,  once  the  model  has  been  developed,  it  may  offer 

a desirable  alternative  to  living  system  experiments,  where  preparation  time  may 

be  many  hours,  months,  or  days,  and  where  surgical  or  chemical  intervention  may 

cause  undesirable  aide  effects.  Such  experiments  can  be  Implemented  on  the  model, 

generally  with  little  difficulty  and  little  loss  of  time.  The  model  can  be  used 

to  "zero  in"  on  a best  experimental  protocol,  saving  the  animal  experimentation 

for  the  final  stagea  of  exploration.  Thus  the  model  doea  not  replace  the  need  for 

animal  experiments  to  finally  validate  methods  and  conclusions,  but  simply  serves  as 

a "short  cut"  to  the  final  procedure,  providing  an  easier,  less  expensive,  and  less 

tine  consuming  alternative  in  the  overall  investigation. 

The  model  can  also  be  used  to  predict  the  effect  of  system  changes  and  system 
sensitivities  to  structural  and  component  changes.  Using  the  model,  it  in  a rela- 
tively simple  matter  to  propose  parameter  alterations,  and  to  observe  the  relative 
significance  of  these  changes  *n  the  operating  characteristics  of  the  total  system,  as 
well  as  the  sensitivity  of  the  system  to  these  changes.  This  is  possible  even  for 
variables  and  parameters  which  cannot  be  observed  directly  in  the  physiological 
environment.  This  capability  has  important  research  and  clinical  applications,  since 
it  can  provide  a means  for  evaluating  the  probability  of  existence  of  various 
pathological  states  and  may  possibly  suggest  the  etiology  of  a particular  disease. 

The  physiological  model  can  also  serve  as  an  effective  adjunct  in  the  training 
of  bioengineers  and  medical  scientists.  The  model  can  present  problems  in  physio- 
logical dynamics  in  terms  of  cause -and -effect  relationships  between  functioning 
parts  of  the  system  and  total  system  operation.  For  example,  it  can  be  used  to 
study  the  response  of  pathological  states  to  various  treatments.  One  important 
attribute  of  such  a model  is  that  a "patient"  can  be  constructed  with  any  desired 
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pathological  condition,  and  the  student  can  be  exposed  to  this  patient  in  such 
the  sane  way  as  he  would  explore  a clinical  case.  Thus  the  student  can  investigate 
siany  varieties  of  disease  states,  propose  and  validate  a host  of  possible  treatment 
protocols,  and  develop  conceptual  info nation  about  pathological  dynamics,  all  In 
a single  nodel  of  the  physiological  systen  of  Interest.  At  present,  however,  such 
conputerlsed  node Is  of  phy si o logical  systen  dynanics  are  not  generally  available, 
but  tutorial,  inquiry -response  and  steady -state  alnulatlons  are  available  and 
finding  growing  acceptance  in  the  educational  community. 


3.  CgYlLOPmUT  OT  MODELS . The  development  of  a nodel  can  be  broken  down  into  four 
phases.  These  are  block  dlegran  formulation,  data  collection,  mathematical  description 
of  the  data,  and  computer  simulation.  The  first  step  la  the  development  of  a block 
dlagran  based  on  the  known  physical  principles  of  the  system  operation.  This 
dlagran  should  display  the  important  characteristics  of  the  system.  This  diagram 
nay  be  too  conplex  for  initial  simulation  since  it  will  probably  include  secondary 
functions  which  are  not  critical  to  overall  performance.  In  addition,  the  diagram 
may  contain  physiological  variables  whose  quantitative  relationships  are  either  not 
available  in  the  literature  or  are  extremely  difficult,  if  not  impossible,  to 
determine  by  physiological  experinentation.  Therefore  a revised  "simplified"  block 
dl'agran  must  be  developed.  This  is  generally  a qualitative  description  of  system 
behavior,  and  at  this  point  quantitative  relationships  nust  be  obtained. 


Physiological  experiments  nust  now  be  performed  in  order  to  derive  dynamic 
input-output  relationships  for  each  block  of  the  model,  unless  these  data  are 
already  available  from  prior  work . Static  characteristics  may  provide  useful 
information  for  uodol  development,  but  they  cannot  provide  the  information  neceaaary 
for  a complete  description  of  system  behavior.  The  design  of  the  experiments  should 
consider  the  particular  subject  (e.g.,  human,  dog,  rat,  etc.),  observation  times 
based  on  system  response  times,  quality  and  availability  of  data  analysis  and 
processing  techniques  (e.g.,  chemical  assays),  effect  of  the  procedures  on  altering 
system  physiology  (e.g.,  surgical  and  chemical  intervention),  and  overall  cost 
of  the  procedure.  Thus  the  block  diagram  nodel  acts  as  a guide  in  designing  the 
physiological  experiments. 


In  order  to  use  the  experimental  data,  a mathematical  description  of  the 
data  must  be  obtained.  These  may  be  functions  of  time  when  considering  system 
dynanics.  If,  for  example,  the  blocks  of  the  model  are  asauned  to  represent  linear 
subsystems  or  linearized  approximations  to  non-linear  operation,  the  final 
mathematical  representation  for  each  block  will  be  a transfer  function  T(s)*  , 

where  Y(s)  and  X(a)  are  the  Laplace  Transforms  of  the  output  and  input,  *TaT 

respectively,  of  the  block.  The  time-domain  description  of  these  functions  may 
be  obtained  using  curve-fitting  techniques. 


This  overall  mathematical  structure  can  be  slnulated  on  an  analog  or  digital 
computer  as  an  aid  in  exploiting  the  model.  Once  a simulation  is  developed  both 
normal  and  pathological  cases  can  be  investigated  by  changing  either  potentiometer 
settings  (analog  simulation)  or  data  values  (digital  simulation).  Both  analog  and 
digital  computers  have  advantages  and  disadvantages  in  their  application.  The 
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analog  computer  Is  the  most  direct  form  of  simulation  since  the  basic  operations 
such  as  Integration  and  multiplication  are  carried  out  continuously  in  either  real 
time  or  o directly  scaled  version  of  real  time.  The  disadvantages  of  this  form  of 
simulation  are  the  necessity  for  amplitude  end  time  scaling,  and  the  complexity  of 
the  wiring  or  patching  which  occurs  as  the  order  of  the  system  increases.  Digital 
computer  implementation  on  either  large  scale  machines  (e.g.,  IBM  370)  or  small 
scale  minicomputers  (e.g.,  DSC  PDP-8)  Is  another  route  for  computer  modeling.  The 
simulation  languages  available  for  use  on  these  machines  (CS1IP,  MIDAS,  ISI*/8)  provide 
a direct  method  for  simulating  an  analog  computer  on  the  digital  computer  facility 
without  the  drawbacks  of  patching  wires  or  time  and  amplitude  scaling.  Disadvantages 
of  large  digital  computer  simulation  are  the  general  unavailability  of  on-line  inter- 
active operation  of  the  simulation  languages  and  long  turn-around  times.  Using  a 
minicomputer  can  avoid  these  difficulties,  but  limited  computer  availability  may  be 
a problem.  However,  as  costs  decrease  and  machine  capability  increases  minicomputers 
are  becoming  more  widely  available  in  biomedical  research  and  education  facilities. 

4.  CASK  STUDIES.  Three  case  studies  will  be  presented  to  demonstrate  tbe  use  of 
model  development  in  designing  experiments  to  study  overall  system  function,  sub- 
system operation,  and  parameter  evaluation.  In  particular,  the  glucose-insulin  and 
testosterone  endocrine  systems,  and  the  respiratory  system  will  be  discussed. 

4A .OUUCOBE-INSULIN  HOMEOSTASIS.  The  development  of  the  glucose-insulin  model 
demonstrates  the  use  ol  modeling  in  the  design  of  experiments  m a situation  similar 
to  that  of  Stark's  approach  to  pupillary  dynamics  (3,4)..  The  glucose  homeostatic 
system  consists  of  a complex  interaction  between  subsystems  regulating  hormonal 
release,  glucose  storage,  and  glucose  utilization.  Eech  such  perfusion  region  can 
be  viewed  as  a combination  of  controller  and  plant  working  together  to  control 
glucose  and  insulin  levels.  The  pancreas  and  liver  may  be  considered  primary 
controllers  due  to  their  function  under  both  hypoglycemic  and  hyperglycemic 
conditions,  while  plant  function  is  represented  by  peripheral  tissue  activity. 

A block  diagram  of  the  primary  Interacting  mechanisms  of  glucose-insulin  control 
is  presented  in  Fig.  1. 

Although  a quantitative  description  of  tqtal  system  function  can  ue  obtained 
from  overall  input-output  measurements  (e.g.,  system  plasma  responses),  a clear 
understanding  of  individual  subsystem  function  and  Interaction  within  the  Intact 
closed  loop  system  can  only  be  obtained  if  each  block  is  Itself  described  quantl- 
tively.  The  modeling  approach  emphasizes  this  fundamental  observation,  and  focuses 
one's  attention  on  those  experimental  procedures  which  will  yield  the  input -output 
data  necessary  for  subsystem  development  in  a dynamic  sense.  Total  system  response 
data  is  widely  available  in  the  literature.  For  example,  fundamental  glucose 
tolerance  test  results  can  be  used  to  relate  system  glucose  response  to  glucose 
input  over  the  time  base  of  the  test.  However,  the  data  needed  to  describe  each 
physiological  block  in  the  figure  is  not  generally  available.  A study  of  the  model 
led  to  the  development  of  an  experimental  protocol  which  satisfied  both  modeling 
requirements  and  physiological  constraints  involved  in  monitoring  system  variables 
for  glucose-insulin  control.  Simultaneous  Input  and  output  plasma  concentrations 
for  glucose  and  insulin  were  obtained  for  the  liver,  pancreas,  and  periphery  over 
a fixed  time  sequence  following  glucose  and  insulin  stimulus,  respectively.  These 
data  were  used  to  derive  mathematical  functions  describing  input  and  output  dynamics 
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for  each  block  of  the  closed  loop.  A set  of  normoglycemic  glucose  nnd  Insulin 
concentration  curves  in  response  to  n g’ucose  loud  are  sif-n  in  Fig.  2.  The 

Impulae-like  glucose  load  drives  the  total  system  Into  a temporary  hyperglycemic 
condition,  which  elicited  a pancreatic  insulin  response.  These  experimental 
results  indicate  an  overreacting  pancreatic  insulin  output,  which  is  mediated  by 
hepatic  insulin  clearance.  Glucose  levels  rose  very  rapidly  throughout  the  system, 
but  began  to  decrease  as  insulin  levels  increased.  Glucose  concentrations  returned 
to  normal  resting  levels  in  a decaying  oscillatory  pattern,  as  would  be  expected  of 
an  underdamped  higher-order  system. 

The  curves  of  Fig.  3 and  4 describe  arterial  and  hepatic  concentration  of 
glucose  and  insulin  following  insulin  loading.  The  additional  parameter  of  elapsed 
time  after  surgery  is  also  Included  in  these  figures.  The  early  poBt-operatlve 
(2  hours  after  surgery)  response  is  more  sensitive  and  less  stable  than  the  late 
post -operative  (between  2 and  14  days  after  surgery)  response.  Arterial  glucose 
levels  decrease  almost  70%  from  resting  levels  and  return  more  slowly  in  the  EPO 
than  the  LPO  cases.  Similarly,  hepatic  settling  time  is  much  greater  in  the  EPO 
case.  It  is  also  initially  highly  oscillatory,  perhaps  indicating  o very  sensitive, 
lightly  damped  system.  Such  differences  between  the  EPO  and  LPO  cases  suggest  a 
possible  test  for  degree  of  recovery  after  surgery. 

Thus,  the  modeling  procedures  have  been  used  os  n guide  in  the  design  of  an 
experimental  protocol  which  was  used  to  obtain  the  data  necessary  for  determining 
true  in-vivo  relationships  between  subsystem  variables.  In  addition,  these  sub- 
system studies  have  Indicated  the  possibility  of  developing  additional  diagnostic 
criteria  baaed  on  dynamic  glucoae  subsystem  response. 

4B. TESTOSTERONE  DYNAMICS.  As  another  example  of  modeling  of  physiologic  systems, 
the  testosterone  system  is  considered  (5,6).  Testosterone,  the  male  sex  hormone, 
gives  the  male  his  secondary  sexual  characteristics  such  ns  hair  distribution, 
skin  texture  and  voice  quality.  Fig.  5 represents  a complete  block  diagram  for 
the  testosterone  control  system.  Testosterone  is  secreted  by  the  gonads  and  adrenal 
cortex  and  is  produced  peripherally  through  conversion  of  precursors.  Hypothalamus- 
pituitary  activity  provides  the  primary  control  of  testosterone  secretion  through 
the  action  of  releasing  factors  and  the  hormones  FSH,  LH  nnd  ACTH*1.  In  conjunction 
with  this,  testosterone  removal  mechanisms  such  as  tissue  storage  and  metabolism 
determine  blood  testosterone  concentration. 

This  block  diagram  contains  several  effects  which  can  be  considered  "second 
order".  These  include  FSH  control,  testosterone  secretion  and  the  "short  feedback" 
pathway  in  which  the  hypothalamus  secretion  of  releasing  factors  is  controlled  by 
the  blood  FSH  Bnd  LH  concentrations.  As  described  earlier,  this  total  qualitative 
model  ia  considered  too  complex  for  use  in  the  initial  modeling  effort.  A simplified 
block  diagram,  shown  in  Fig.  6 was  developed  in  which  second  order  effects  were 
eliminated . 

As  in  the  glucose -insulin  case,  an  experimental  protocol  was  developed  to 
obtain  mathematical  descriptions  of  each  block  of  the  figure.  As  an  example,  to 
mathematically  describe  the  testosterone  disappearance  block,  an  experiment  was 

d.  FSH;  Follicle-stimulating  Hormone 
LH:  Luteinizing  Hormone 

ACTH:  Ad renocortico trophic  Hormone 
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designed  in  which  radioactively  labelled  testosterone  was  rapidly  injected 
intravenously  into  s rat  and  blood  samples  were  obtained  at  specific  times 

following  Injection.  These  blood  samples  were  analysed  for  radioactivity  and 
the  resulting  data  Is  shown  in  Fig,  7.  Since  the  experimental  procedure  limits 
all  input  excitations  to  small  perturbations  about  normal  circulatory  steady  state 
levels,  the  model  can  be  considered  to  be  linear.  Thus,  the  curve  of  Fig.  7,  which 
is  the  "step  response"  of  the  testosterone  disappearance  block,  can  be  used  to 
generate  a transfer  function  for  this  subsystem.  The  analog  simulation  of  this 
transfer  function  is  shown  in  Fig.  8.  Similar  procedures  lead  to  transfer 
functions  and  simulations  for  the  other  blocks  of  the  model. 

Once  a working  simulation  is  developed,  experiments  are  performed  on  the 
model  to  validate  its  performance  characteristics  and  to  improve  knowledge  of 
system  behavior.  This  additional  information  can  be  used  to  croate  a more  refined 
model.  If  little  quantitative  information  is  available,  experiments  on  the  model 
may  suggest  physiological  experiments  to  be  performed  to  obtain  such  information. 

The  open  loop  response  of  each  block  of  the  testosterone  model  compared  favorably 
with  experimental  results.  Closed  loop  teats  were  then  performed  on  the  model. 

As  an  example,  consider  exciting  the  model  with  a step  of  voltage  at  the  input 
of  the  testosterone  disappearance  block.  This  corresponds  physiologically  to 
a rapid  Intravenous  injection  of  testosterone  at  times  t=0.  Responses  are  observed 
at  the  outputs  of  the  LH  disappearance  and  testosterone  disappearance  blocks, 
corresponding  physiologically  to  the  blood  Ui  and  testosterone  concentrations, 
respectively.  The  results  are  shown  in  Fig.  9,  which  displays  the  deviations  from 
baseline  of  these  curves.  As  can  be  seen,  the  blood  testosterone  level  begins  at 
the  injected  level  and  returns  to  baseline  with  some  oscl llatl  >n  within  24  hours 
after  injection.  The  blood  1M  concentration  begins  below  baseline  in  order  to 
compensate  for  the  Increased  testosterone  level.  The  LH  concentration  then  re- 
turns to  baseline,  again  with  a slight  oscillation,  within  24  hours  after 
injection. 

These  results  are  as  expected  using  a qualitative  knowledge  of  system  behavior, 
but  there  are  no  quantitative  physiological  data  available  with  which  to  check 
the  results.  It  is  therefore  necessary  to  perform  physiological  experiments 
to  generate  such  quantitative  data. 

4C.  RESPI RATORY  FUNCTION.  A digital  computer  simulation  of  respiratory  function 
has  been  developed,  based  on  the  block  diagram  representation  of  Fig.  10  (7,8,9). 

This  diagram,  unlike  that  of  the  original  Orodin's  model,  Includes  all  that  is 
known  about  respiratory  function  and  control,  at  least  in  n qualitative  sense. 

Once  the  overall  system  is  developed,  each  subsystem  must  be  described  individually, 
and  the  appropriate  interaction  must  be  included  so  that  the  combined  subsystems 
response  to  a simulated  physiological  input  such  as  intrapleural  pressure  would 
closely  resemble  those  of  the  living  system.  Just  as  In  the  glucose-insulin  study, 
the  overall  complex  model  was  initially  developed  qualitatively,  and  then  each  sub- 
system was  studied  individually  and  described  mathematically.  Unlike  the  glucose 
case,  an  experimental  protocol  was  not  necessary,  since  each  block  was  described 
from  the  basic  physics  of  the  system  function,  and  the  specific  parameter  values  were 
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already  available  in  the  literature.  Of  particular  intereat  is  the  Interconnection 
of  the  aubayeteaia  representing  respiratory  Mechanics,  alveolar  nixing  of  respiratory 
gases,  and  dif fusion  between  the  alveolar  space  and  the  pulmonary  capillary  bed. 

A simplified  version  of  the  mechanics  section  is  shown  in  Fig.  11.  This  isodel 
includes  the  trachea-bronchi  resistive  pathway  and  the  storage  conpartnent  of  the 
lung.  Also  shown  in  the  figure  is  the  program  listing  used  to  represent  the  mechanics 
system  dynamics.  The  program  was  written  in  the  I8L/8 0 simulation  language  on  a 
EEC  Pitt -81  minicomputer,  typical  results  of  this  simulation  are  shown  in  Fig.  12. 

A more  detailed  model  of  respiratory  mechanics  has  also  been  developed.  It  includes 
trachea  resistance,  non-linear  bronchial  resistances,  and  non-linear  bronchial  and 
lung  compliancea.  In  addition,  it  includes  flexible  airway  (bronchial)  tissue 
inertance.  The  effect  of  aimray  inertance  on  overall* system  function  hss  been 
questioned  in  previous  studies^  The  inertance  parameter  in  not  easily  measured  or 
changed  in  the  actual  living  system,  but  tt  can  easily  be  varied  in  the  computer 
simulation.  This  was  done  on  the  detailed  mechanics  model  and  the  results  are  shown 
; in  Fig.  13.  This  figure  represents  air  flow  into  the  lung,  with  inertance  values 

[ as  a parameter  of  the  study.  The  curves  indicate  that  inertance  variation  has  no 

! effect  on  the  overall  flow  characteriatlcs  of  curve  shape  and  timing,  but  does  have 
| a small  effect  on  maximum  and  minimum  levels  of  total  respiratory  flow.  Thus  the 

( model  has  been  used  as  a subject  of  an  experimental  procedure  when  the  ectual 

j experiment  on  a living  system  was  not  possible.  Of  course,  the  results  and  conclusions 

• of  such  an  experiment  can  only  be  as  good  as  the  model,  and  thus  the  validity  of 

| the  model  must  be  detemlned  prior  to  such  experimentation. 

f 

i 

: The  diffusion  model  can  also  be  used  to  illustrate  an  application  of  modeling 

| to  experimental  design' and  parameter  evaluation.  Fig.  14  represents  the  general 
| model,  consisting  of  a single  compartment  lung  and  multi compartment  pulmonary 

[ capillary  bed.  Unlike  previous  models  of  pulmonary  gas  diffusion,  however,  this 

model  represents  the  oxygen -hemoglobin  interaction  within  the  pulmonary  blood  as 
! a storage  (and  hence  capacitance)  phenomenon,  and  not  as  a diffusion  resistance. 

| This  difference  In  model  concept  suggested  looking  at  the  standard  laboratory  test 
used  to  evaluate  diffusion  capacity  (a  real stance -like  element),  and  to  develop 
variations  of  this  standard  teat  to  see  If  the  new  approach  is  really  a reasonable 
one.  Variations  of  breath-holding  time  in  the  single-breath  teat  resulted  in  a 
diffusing  capacity  parameter  which  decreased  linearly  with  breath  holding  time, 
when  plotted  on  semi>log  paper.  This  variation  in  "diffusing  capacity"  can  be 
explained  using  the  hemoglobin  storage  concept  developed  for  this  model,  but  ia 
not  easily  explained  using  the  original  concept  of  diffusion  resistance.  Thus, 
a modeling  approach  could  be  used  to  develop  Improved  interpretations  of  standard 
clinical  tests.  This  is  yet  another  application  of  physiological  modeling. 


5.  CONCLUSIONS . This  presentation  has  utilized  several  case  studies  to  demonstrate 
the  use  of  model  development  in  designing  experiments  to  study  overall  system 
function,  subsystem  operation,  and  parameter  evaluations.  In  particular,  the 
glucose-insulin  system,  testosterone  system,  and  respiratory  system  were  discussed. 

e.  ISL/8;  an  Interactive  Simulation  Language  developed  for  *he  DEC  PDP-8 

minicomputer  by  Interactive  MinisystemB , Inc.,  Kennewick,  Washington  99336. 


306 


l 


REFERENCES 


1.  Orodins,  F.S.,  J.8.  Gray , K. R.  Bchroeder,  A.L.  Korin*,  R.W.  Jones, 

"Respiratory  Responses  to  CO2  Inhalation:  a Theoretical  Study  of  a Nonlinear 
Biologloal  Regulator",  J.  Appl.  Phyaiol.  7:283,  1954. 

2.  L.  Stark,  "Stability,  Oacillationa,  and  Noiae  in  the  Hunan  Pupil  Servomechanism", 
Proc.  IRS  47:1925,  1959. 

3.  Pinkelatein,  S.M.,  M.  Bleicher,  8.  Batthany,  J.  Tiefenbrun,  "In-Vivo  Modeling 
for  Olucoae  Homeoatasis",  I EBB  Trans,  on  Blomed.  Bng.  22:47,  1975. 

4.  Tiefenbrun,  J.,  S.M.  Pinkelatein,  W.  Shoemaker,  "Glucose  Homeostasis  and 
the  Glucose-insulin  Feedback  System  in  the  Postoperative  State",  Adv.  Exp. 

Med.  Biol.  33  209,  1973. 

5.  Batsman,  S.S.,  ’Vlathematical  Model  of  the  Testosterone  Control  System", 

Ph.D.  Dissertation  in  Bioengineering,  Polytechnic  Institute  of  New  York, 

1974. 

6.  Batsman,  S.S.,  S.M.  Pinkelatein,  "An  Example  of  the  Use  of  Computer  Simulation 
in  Physiologic  Systems",  Trans,  (temp.  Educ.  Div.,  ASBB  7:13,  1975. 

7.  Pinkelatein,  S.M.,  W.  Blesser,  L.  Braun,  H.  Lyons,  "Simulation  of  the  Effects 
of  Airway  Tissue  Inertance  on  Respiratory  Mechanics",  Proc.  25th  ACKMB 
14:364,  1972. 

8.  Braun,  L.,  W.  Blesser,  S.  Pinkelatein,  H.  Lyons,  "A  Model  of  Gas  Diffusion 
in  the  Pulmonary  System",  Proc.  26th  ACKMB  26:354,  1973. 

9.  Albergoni , V.,  C.  Cobelll,  0.  Forresin,  "interaction  Model  Between  the 
Circulatory  and  Respiratory  Systems”,  IEEE  Trans,  on  Bloated.  Eng.  19:108, 

1972. 


307 


ARTERIAL  PLASH  A INSULIN 


Ln 

T 

l 

l 


f- 

§■- 


Arterial  plasma  glucose  and  Insulin  respose  to  an 
Insulin  load,  for  early  and  late  postoperative  studies 

Fig.  3, 
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Hepatic  glucose  response  to  insulin  lording  for 
early  and  late  post  operative  studies. 
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Major  control  pathways  of  testosterone  concentration 

Fig.  5. 
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Simplified  Block  Diagram  of 
Testosterone  Control  System 

Fig.  e. 
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I A DESIGN  FOR  THE  DETECTION  OF  SYNERGY  IN  DRUG  MIXTURES 

P.  V.  Plserchla 
B.  V.  Shah 

Raaaarch  Triangle  Inatituta 
Poat  Office  Box  12194 
Raaearch  Triangle  Park,  North  Carolina 

ABSTRACT.  In  Biometrics  [September,  1969},  P.  S.  Hewlett  gives 
a definition  of  synergy  baaed  on  the  curvature  of  isobars  of  drug 
mixtures.  Specifically,  if  X(0)  and  Y(9)  represent  doses  for  two  drugs 
A and  B which  correspond  to  an  ED(6)  response  level  (i.e.,  a proportion 
0 of  all  individuals  tested  will  show  the  specified  response)  and  if 
(XX(9),  (l-A)Y(O))  represents  a dose  of  a mixture  consisting  of  a pro- 
j portion  X of  X(6)  and  (1-A)  of  Y(6),  then  synergy  is  absent  or  present 
! according  to  whether  the  proportion  F(A)  of  individuals  responding  to 
| the  dose  (AX(G),  (1-A)Y(6))  equals  or  exceeds  6 for  various  values  of 
| X;  that  is, 

j 

j P(A)  > 0 for  some  X implies  synergism. 

j 

| An  immediate  consequence  of  this  definition  which  we  prove  is: 

| Suppose  Xq  and  Yq  are  two  doses  (not  necessarily  equivalent) 

! of  A and  B.  Consider  the  straight  line  connecting  Xq  and  Yq 

and  written  as  X ■ XXQ,  Y - (1-X)  Yq,  0 s X S 1,  Then,  if 
| there  exists  a Xn  such  that 

j P(y  - P(X0X0,  (1-Xq)  Yq)  > max{P(X0,0),  P(0,Yq)} 

i 

then  there  exists  a nonlirear  isobar  and,  hence,  synergy  is 
, Bhown  to  occur. 

The  Import  of  the  above  derives  from  the  fact  that  a test  for 
synergy  in  drugs  may  be  performed  with  as  few  sb  three  test  groups 
(those  receiving  Xq  alone,  those  receiving  Yq  alone  and  those  receiving 

(AqXq,  (1-Aq)  Yq))  and,  perhaps  more  important,  the  doses  Xq  and  Yq 

need  not  be  equivalent. 

I 

i 1.  INTRODUCTION  AND  DEFINITION  OF  SYNERGY.  In  this  paper,  we 

• shall  consider  the  effects  of  two  drugs,  combined  in  various  mixtures, 
on  the  responses  of  some  biological  system  or  organism.  The  principal 
question  of  Interest  is  whether  the  phenomenon  of  synergism  occurs. 
Following  Bushby  [1969],  we  say  synergy  between  two  drugs  occurs  when, 

1 acting  together,  they  evoke  the  same  response  as  when  they  act  sing- 
ly, but  at  lower  concentrations,  or  their  effects  interact  in  a fashion 
which  is  to  the  advantage  of  the  organism  by  producing  an  otherwise  un- 
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attainable  rlaa  In  biological  activity 


Bach  of  tha  above  concepta  la  related  to  the  nature  of  sons  me- 
chanlem  of  joint  drug  ection.  A aubatantlal  amount  of  effort  haa  been 
devoted  to  tha  conatructlon  of  nathenatlcal  and  atatlatical  aodala  for 
joint  drug  action  (aae  Plackett  and  Hewlett  [1967]  and  Aahford  and  Snith 
[1965]  for  a aultable  liat  of  rafarencaa).  However,  certain  aapecta 
of  thla  reaearch  appear  to  be  controversial  end  no  comprehensive  and 
overall  acceptable  model  exists.  One  reaaon  for  thla  la  due  to  the 
coaplex  Banner  in  which  the  ef facta  of  drug  mixturea  are  nanlfeated. 

To  uae  the  terminology  of  Hewlett  and  Plackett  [1959]  and  Plackett  and 
Hewlett  [1967],  the  joint  action  of  two  druga  may  be  alallar  or  die- 
aimllar  according  to  whether  the  primary  aitea  of  action  for  the  two 
druga  are  tha  aaaa  or  different.  Alternatively,  the  joint  action  may 
be  non-interactive  or  interactive  if  one  drug  haa  either  no  influence 
or  some  influence  on  the  biological  activity  of  the  other. 

These  distinctions  have  given  rise  to  four  aituationa  aa  deacribed 
in  the  following  tablet 


Similar 

Dlasimilar 

Non-Interactlve 

Simple  Similar 

Independent 

Interactive 

Complex  Similar 

Dependent 

Plackett  and  Hewlett  [1967]  further  indicate  that  one  criticiam  of 
the  above  claaaiflcatlon  la  that  the  "action  of  two  druga,  whether  in- 
teractive or  not,  may  in  aome  aenae  be  partially  similar;  similar  and 
dissimilar  actions  should  be  regarded  aa  at  opposite  enda  of  continuum 
of  biological  possibilities. " Within  this  context,  the  concept  of  syn- 
ergism is  primarily  related  to  whether  the  effects  of  drug  mixtures  la 
non-interactive  or  interactive  regardless  of  its  position  along  the  con- 
tinuum from  similar  to  dissimilar.  However,  part  of  the  controversy 
associated  with  this  topic  pertains  to  the  equating  of  no  synergism  to 
only  the  simple  similar  situation.  Hence,  although  there  do  exist  a 
number  of  methods  for  fitting  joint  action  models,  an  alternative 
approach  to  the  concept  of  synergy  which  is  widely  acceptable  to  most 
research  workers  is  required. 

As  a result,  Hewlett  [1969]  has  discussed  the  measurement  of  the 
potencies  of  drug  mixtures  in  termB  of  isobars,  a procedure  used  in 
pharmacology.  To  construct  an  isobar  for  two  drugs,  the  doses  of  the 
druga  ere  measured  respectively  on  actual  physical  scalea  (e.g.,  mg/cc) 
along  the  two  axes  and  hypothetical  points  representing  the  dose  pairs 
producing  a fixed  biological  response  are  plotted  (e.g,,  50%  of  tha  in- 
dividuals receiving  such  a drug  mixture  dose  evoke  some  specified  quantal 
response).  Of  course,  in  an  actual  situation  these  points  would  have 
to  be  determined  experimentally;  but,  to  elucidate  the  concept  we  shall 
presume  that  the  desired  set  of  points  is  already  known.  An  example 
is  shown  in  the  figure  below  where  the  fixed  points  on  the  two  axes 
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correspond  to  ths  dosss  for  ths  two  drugs  separately  which  lead  to  a 
50%  response  rate  among  the  tested  individuals. 


ED  (90) 


Drug  B 


Figure  I.  Hypothesized  Isobar  for  two  synergistic  drugs. 

The  curve  in  the  Figure  1 is  called  an  isobar.  If  it  is  a straight 
line,  then  one  says  that  the  two  drugs  show  "additive  action."  On  the 
other  hand,  if  it  falls  below  the  straight  line  connecting  the  two 
fixed  points,  then  one  says  that  synergism  (or  potentiation)  occurs. 

This  definition  tends  to  bypass  the  question  of  similarity  or  dissimilarity 
of  the  joint  drug  action  but  yet  is  consistent  with  lower  concentrations 
evoking  the  same  response  which  Bushby  [1969]  uses  in  describing  synergy. 

Hence,  throughout  the  remainder  of  this  paper,  synergy  will  be  viewed 
as  curvatlve  of  Isobars,  giving  rise  to  the  following  formal  definition 
of  synergy. 

Let  P(X,Y)  denote  the  proportion  of  individuals  responding  to  a 
mixture  of  drugs  A and  B,  where  X ■ X units  of  A and  Y ■ Y units  of  B. 

Assume  that  P(X,Y)  obeys  the  following: 

(a)  0 & P(X,Y)  s 1 tor  X i 0,  Y i 0, 

(b)  P(X,0)  and  P(0,Y)  are  continuous  and  monotonicslly 
nondecreasing  functions  of  X and  Y,  respectively. 

If  for  a specific  6 there  exists  an  X or  Y such  that  P(X,0)  ■ 9 
or  P(0,Y)  - 6,  denote  X as  X(8)  and  Y as  Y(9). 

Now,  suppose  there  exists  a combination  of  A and  B denoted  as 
(X*,Y*)  with  P(X*,Y*)  ■ 6*  (say),  then  the  combination  (X*,Y*)  is 
said  to  be  synergistic  if  one  of  the  following  conditions  holds: 

Condition  1:  If  neither  X(9*)  nor  Y<9*)  exist  then  (X*,Y*)  is  syn- 

ergistic if  0*  > P(X,0)  for  all  X and  0*  > P(0,Y)  for  all  Y. 

Condition  2:  If  either  X(0*)  or  Y(0*),  but  not  both,  exist  then  (X*,Y*) 
is  synergistic  if  X*  < X<0*)  and  0*  > P(0,Y)  for  all  Y,  or,  Y*  < Y(9*)  and 
9*  > P(X,0)  for  all  X. 
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Condition  3:  If  X<8*)  and  Y(8*)  both  exist  than  (X*,Y*>  la  synergistic 

If 

X*  + Y*  < , 

x(s*)  TTe*) 

Briefly,  condition  (1)  maintains  that  (X*,Y*)  la  aynargiatic  if 
an  otherwise  unattainable  rise  in  biological  activity  ia  achieved 
[Buahby,  1969].  Conditions  (2),  (3)  are,  formally,  Hewlett's  [1969] 
conditions  for  synergy. 

2.  IKPLICATIONS  OF  THE  DEFINITION.  An  immediate  consequence  of 
the  above  definition  is  the  following  theorem  and  proof. 

Theorem;  Suppose  XQ  and  YQ  are  two  doaea  (not  necaaaarily  equiva- 
lent) of  drugs  A and  B.  Consider  the  straight  line  joining  (Xq,0)  end 
(0,Y0)  and  written  as  X - XXQ,  Y - (1-A)YQ,  0 U s 1.  Then,  if  there 
exists  a Xq  such  that: 

80  - P(X0X0,  (1“A0)Y0)  > max{P(X0,0),  P(0,Y0>>, 

then  (XqXq,  (1-XqJYq)  Is  a synergistic  combination  of  A and  B. 

Proof : 

Case  1;  Suppose  neither  X(8g)  nor  Y(6q)  exist.  Then,  by  the  continuity 
assumption,  8g  > P(X,0)  for  all  X,  and,  8q  > P(0,Y)  for  all  Y. 

Hence,  (XqXq,  (1-X0)Yq)  ia  synergistic  by  Condition  1. 

Case  2:  Without  loss  of  generality  assume  X(8q)  exists  and  Y(8Q)  does 

not.  Then  again,  by  the  continuity  assumption, 

6q  > P(0,Y)  for  all  Y. 

Also,  P(X(eQ),  0)  - 0Q  > P(X0,0),  by  assumption,  and,  through  mono- 
tinicity,  X(0q)  > XQ. 

Therefore,  X(9Q)  > XQ  > XQX0  and  (XqXq,  (1-XQ)Y0)  is  synergistic  by 
Condition  2. 

Case  3;  If  X(0Q)  and  Y(0q)  both  exist  then,  0Q  - P(X(0Q),  0)  > P(XQ,0) 
and,  eQ  - P(0,Y(60))  > P(0,Yq). 

Hence,  by  the  monotinicity  assumption  we  have; 

X(60)  > XQ  and  Y(60)  > YQ. 

Therefore, 

AqX(6o)  > an<*  <1-VY(9o>  > 
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Therefore, 


i > Vo  , . . (1“VY0 

X0  X(0Q)  *nd  1_X0  Y(60)  * 


, (1-Ao)Yo  _ . 

X(80)  + Y(e0) 


and  (AqXqi  (I-Xq)Yq)  Is  synergistic  by  Condition  3. 

Graphically,  the  above  theorem  is  represented  in  Figures  2 and  3 


Notice  the  above  does  not  require  Xg  and  Yq  to  be  equivalent  doses; 

however,  it  does  require  that  max  P(A)  be  greater  than  both  end  points, 

A 

It  is  not  sufficient  to  show  P(A)  > AP(1)  + (1-A)  P(0).  An  example  should 
suffice. 

Consider  the  response  defined  by 

P(X,Y)  - log  (X+Y+l)  for  X + Y s e - 1, 

- 1 for  X + Y > e - 1, 

then,  the  Isobars  of  P{a,Y)  are  the  lines  X + Y ■ const.  Clearly, 
straight  line  isobars  r.:\d  by  definition  an  additive  mixture.  However, 
consider  the  response  along  any  line  of  the  form  X ■ AXq,  Y ■ (1-A)  Yq 

where  XQ  > Yq.  We  have, 

P(A)  - P(AX0,  (1-A)  Yq)  - log(AX0  + (1-A)  YQ  + 1) 

- lOg  (A  (Xq-Yq)  + Yq  + 1). 

Certainly,  P(A)  > AP(1)  + (1-A)  P(0)  for  every  0 < A < 1,  but  yet, 
by  definition,  the  mixtures  are  additive. 

Figure  4 gives  the  geometry  of  the  situation. 

3 . OPTIMAL  MIXING . Associated  with  but  not  equivalent  to  synergy 
is  the  concept  of  the  optimal  mixing  of  two  drugs. 


Figure  4,  A non-linear  , additive  drug  mixture. 
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We  a ay  two  drug*  h*v*  an  optimal  mixing  rata  if  thara  la  a rldga 
in  the  raaponaa,  P(X,Y),  in  a atraight  ling  diraction.  If  tha  projaction 
of  tha  rldga  onto  tha  (X,Y)  plan*  ia  a lina  Y - pX  than  wa  eay  X and  Y 
hova  an  optimal  mixing  rata  P ■ Y/X. 

Tha  concapt  of  optimal  mixing  la  uaaful  in  aatabliahing  aynargy. 
Suppoaa  an  optimal  mixing  rat*  axiata.  Than,  if  Xq  and  Yq  ara  any  two 

dot*  of  X and  Y,  w*  hava  max  P(A)  - max  P(AX_,  (1-A)  YQ)  occur*  at  tha 

A A 

in t enaction  of  tha  two  line*; 

(1)  X - AXq,  Y - (1-A)  Yq, 

(2)  Y - pX. 

Solving  for  A,  we  obtain 

» - V(bX0  + V- 

or  aquivalantly, 

* - Vo/<pXo  + V- 
V - oVo/(plto  + V- 

It  ia  to  ba  noticad  that  optimal  mixing  ia  dafinad  in  tarma  of  tha 
parameter  p and  not  in  tarma  of  A.  We  mention  thla  eo  aa  to  avoid  con- 
fuaion  in  picking  combination*  of  doaaa  which  are  not  on  tha  line  of 
optimal  mixing.  For  inatanca,  auppoaa  optimal  mixing  occur*  in  a 111 
ratio.  Then,  p ■ 1 and  the  J.ine  of  optimal  mixing  ia  Y - pX  - X. 

Now,  uuppoae  we  chooaa  doaaa  Xq,  Yq  where  Xq  > Yq.  Then  in  Figure  5, 
we  have 
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The  maximum  of  P(X,Y)  along  X « XXQ,  Y - (1-A)  YQ  occurs  at  tha 

intersection  of  X ■ AXq,  Y ■ (1-X)  Yq  and  Y * X.  It  doaa  notoccur 

whan  X ■ 1/2.  Keeping  this  In  mind,  selection  of  combination  dosaa 
becomes  a more  rational  procedure. 

A.  DESIGN  AND  ANALYSIS.  Having  defined  synergy , we  nov  proceed 
to  give  certain  methods  useful  in  showing  synergism  if  it  exists. 

The  simplest  design  is  the  three  point  design.  For  a three  point 
design,  one  chooses  doses  Xq  of  A and  Yq  of  B and  a combination 

(AXq,  (1-X)  Yq)  of  A and  B.  Synergism  is  then  said  to  exist  if  one  can 

show  I 

P(A)  - P(XXQ,  (1-X)  Yq)  > max(P(X0,0),  (O.Yq)} 

We  propose  to  do  this  by  testing: 

HQ:  P(X)  s max(P(X0,0),  P(0,Yq)) 

against  the  alternative: 

Ht:  P(A)  > max(P(XQ,0),  P(0,Yq)). 

The  test  statistics  used  will  be  the  simple  large  sample  normal 
test  for  differences  between  two  binomial  proportions.  However,  the 
critical  region  used  will  be  of  the  form: 


where  P(Xq,0),  P(0,Yq)  and  P(X)  are  the  observed  proportions  of  indiv- 
iduals responding  at  doses  Xq  and  Yq  and  combination  (AXq,  (1-X)  Yq) , 

respectively,  with  Q(Xq,0),  Q(0,Yq)  and  Q(X)  being  the  respective  pro- 

2 

portions  not  responding.  Letting  a * .05  we  obtain  2.  *■  Z Q •>  .760. 

« l-“Ct  • / o 

Letting  a - ,01,  we  have  Z,  * Z * 1.285. 

1-a  . 90 
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No  tic*  that  in  the  above,  no  aaaunption  ia  made  about  the  equivalence 
of  Xq  and  Yg.  Thla  ia  not  aaauaad  because  it  ia  not  necessary  to  choose 

equivalent  doses  to  establish  synsrgy.  Also,  no  assumption  is  made  about 
X.  Again  this  ia  dona  because  no  assumption  concerning  X (other  than 
0 s X i 1)  is  necessary.  However,  intuitively,  the  efficiency  of  the 
test  procedure  should  be  greatest  when  P(X)  is  maximum.  Therefore  X should 
be  chosen  such  that  the  combination  lies  on  the  intersection  of  the  line 
connecting  Xq  and  Yq  and  the  line  of  optimal  mixing  as  given  in  section 

3 of  this  paper. 

The  Tables  1-IV  present  minimum  sample  sixes  needed  to  detect  sy- 
nergy for  various  values  of  ?x  ■ P(X,0)  • P(0,Y)  - Py  and  P(X)  ■ P^  > P^. 

The  four  tables  give  required  sample  sizes  for  significance  levels  .05 
and  .01  and  power  .80  and  .90. 

If  we  define  Z1_fl  and  Z^_g  as  tha  (l-a)-th  and  (l-B)-th  percentage 
points  of  the  normal  (0,1)  distribution  respectively  and  if  we  let 
ffX  " ’*^5-  and  ox  ■ /p~  (1-P^  j than  the  formula  for  determining  N, 
the  total  sample  else,  is  given  by: 

» ■ °X  + <Zl-»  + Zl-8>2/<PA  - PX)2‘ 

2 2 
where  a is  the  significance  level  of  the  test  and  (1-8)  is  tha  power 

of  tha  test. 

To  determine  N^,  Ny  and  for  a given  N allocation  is  carried  out 
by: 

“a  - ’x  + V' 

and 

*r  ■ "x  " k <*  - V' 

Integer  values  for  N,  N^,  Ny  and  N^  were  determined  by  rounding  off 

+ Nx  - N. 

5 . SUMMARY . Beginning  with  an  intuitively  appealing  definition  of 
synergy  given  by  Hewlett  [1969],  we  have  attempted  in  this  paper  some 
exploration  of  the  implications  of  this  definition,  tried  to  dispel  cer- 
tain naive  notions  concerning  the  analytic  characterization  of  synergy 
and  concerning  the  optimal  mixing  of  drugs.  Too,  we  have  suggested  a 
testing  procedure  to  determine  the  existence  of  synergy  and  have  given 
sample  sizes  required  to  detect  it. 

The  techniques  discussed  in  this  paper  are  illustrated  in  the  follow- 
ing example. 


tha  values  determined  by  the  formulae  so  that  + Ny 
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Suppose  vt  wish  to  d«t«ct  synergy  in  a mixture  of  drugs  A and  B. 
Further  suppose  we  know  1 unit  of  A la  approximately  equivalent  to  3 
unite  of  B and  that  A and  B have  an  optimal  mixing  rate  of  1 part  A to 
2 parte  B.  Nov,  denoting  A as  X and  B as  Y ve  have  Xq  ■ 1.0,  Yq  ■ 3.0 

and  Y ■ pX  ■ 2X.  To  derive  the  best  combination  of  A and  B ve  find 
X - X0Y0/(pXq  + YQ)  - .60  unite  of  A, 
and 

Y - pXqY Q/(pX0  + Yq>  - 1.20  unite  of  B. 

Nov,  suppose  Xq  • 1 and  Yg  “ 3 are  approximately  HD(.50)'s  of  A 

and  B and  it  Is  suspected  that  the  combination  (.60,  1.20)  gives  an 

2 

expected  cure  rata  of  .70.  Then,  for  an  o - .05  level  test  with  power 
.80  ve  find  N ■ 144  when  ■ ?Y  ■ .50  and  ■ .70.  We  find  , Ny 

end  Ng  by  the  follovlng: 

“j  - °x  + V 

- (144)  (/mUV)/(S2  x /UsYUs?  + /T7TT3T) 

- 56.62. 

Ny  - Nx  - |(N  - N^)  - |(144  - 56.62) 

- 43.68. 

Hence,  we  take  56  experimental  units  for  the  combination  (.60,  1.20) 
and  44  each  for  the  individual  applications  of  A (1  unit)  and  B (3  units). 
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Minimum  Sample  Size  for  Detecting  Synergy 


Table  I 


Table  II 

Significance  Level  ,05  Power  .90 


X 

.4 

.5 

.6 

.7 

.8 

.9 

.3 

751 

192 

84 

45 

26 

15 

.4 

0 

823 

204 

86 

44 

23 

.5 

0 

0 

830 

198 

81 

37 

.6 

0 

0 

0 

768 

174 

66 

.7 

0 

0 

0 

0 

637 

132 

.8 

0 

0 

0 

0 

0 

435 
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Minimum  Sample  Size  for  Detecting  Synergy 


Table  III 


Significance  Level  .01  Power  .80 


.4  .5  .6  .7  .8  .9 


P »P 
rX 
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ABSTRACT.  The  problea  of  selecting  the  beat  out  of  several  treat- 
ments' with  dichotomous  responses  is  considered  in  the  framework  of  the 
Bechhofer  sequential  selection  model  with  emphasis  on  minimising  the 
number  of  patients  assigned  to  the  inferior  treatments.  Adaptive  sampling 
rules  are  proposed  for  the  situations  where  the  response  to  the  treatments 
is  delayed  or  where  several  patients  have  to  be  scheduled  at  each  stage. 
Protocols  which  employ  the  new  sampling  rules  with  various  termination 
rules  considered  in  the  literature  are  shown  to  be  superior  or  comparable 
to  those  which  employ  the  familiar  Vector-at-a-Time  or  Play-the-Vinner 
sampling  rule  in  terns  of  the  average  sample  number  and  the  inferior 
treatment  number. 

1.  INTRODUCTION  AND  DEFINITION  OP  SAMPLING  RULES.  Ut  Bj.Ij,  ...  ,1^ 

be  k (k  > 2)  binomial  populations  with  respective  unknown  probabilities  of 
success  p^Pjf'iPfc  where  Pj  > pt  for  i ■ 2,3,...,k.  The  problem  of 

identifying  the  population  with  the  largest  probability  of  success,  the 
'best'  population,  has  been  extensively  studied  in  the  literature.  In 
this  paper  we  are  mainly  concerned  with  the  sequential  selection  model 
for  this  problem  as  formulated  by  Bechhofer  (1958)  and  Bechhofer,  Kiefer 
and  Sobol  (1968),  and  adopted  by  Sobel  and  Veiss  (1970)  to  the  problem  of 
clinical  trials  where  several  treatments  with  dichotomous  responses  are 
being  compared. 

The  Bechhofer  model  assimies  sequential  sampling,  and  consists  of  a 
sampling  rule  which  specifies  the  population  to  be  sampled  at  any  given 
stage  and  a termination  rule  which  directs  when  to  stop  sampling  and  how  to 
make  the  final  choice  of  the  best  population.  The  selection  is  to  be  made 
subject  to  the  P*,A*  -admissibility  requirmaent  on  the  probability  of 
correct  selection  (CS)  that 

P(CS)  > P*  for  Pj-maxtpj.Pj,...^)  > A*  (1) 

where  P*  ( j*  < P*  < 1)  and  A*  (0  < A*<  1)  are  prespecified  constants. 


337 


Preceding  pip  Mmk 


In  the  context  of  clinical  trials  the  Bechhofer  model  provides 
admissible  protocols  which  assign  patients  to  the  treatments  sequentially 
in  time,  one  or  more  at  each  stage,  until  the  best  treatment  is  identi- 
fied with  a specified  probability.  A*  can  be  interpreted  as  the  medi- 
cally significant  or  detectible  difference.  For  specified  P*  and  A*, 
choice  among  the  various  possible  admissible  protocols  is  usually  made 
on  the  basis  of  the  (random)  nuaber  N.  of  patients  assigned  to  treatment 

i (i  ■ l,2,...,k)  and  the  total  number  N of  patients  needed  to  reach  a 
decision.  More  specifically,  Sobel  and  Weiss  (1970,  1972)  base  their 
comparisons  on  the  loss  functions 

k k 

E(N)  - lE(N),  l E(N.)  (2) 

i-1  1 i«2  1 

and  the  risk 

l 

^ (P1-Pi)H(Ni) 

v-»2 

the  last  two  measures  being  given  more  importance  for  obvious  ethical 
reasons. 

It  is  convenient  at  this  point  to  specialize  our  discussion  to  the 
case  when  k * 2;  a major  portion  of  this  paper  as  well  as  most  of  the  past 
work  in  this  area  is  confined  to  the  comparison  of  two  treatments.  The 
admissibility  condition  (1)  now  reads 

P(CS)  > P*  for  A - px-p2  > A*,  (3) 

and  the  loss  functions  of  interest,  given  in  (2),  become  E(N),  known  as  the 
Average  Sample  Number  (ASN),  and  E(N2),  the  Inferior  Treatment  Number  (ITN) . 

Most  of  the  protocols  considered  so  far  in  the  literature  fall  into 
two  broad  classes  depending  on  the  sampling  rule  employed.  The  older  and 
more  familiar  sampling  rule  is  the  so-called  Vector-at-a-Time  (VT)  rule 
which  assigns  patients  to  both  of  the  two  treatments  at  each  stage,  one  to 
each  treatment  randomly,  until  a selection  is  made  based  on  the  termination 
rule.  An  essentially  equivalent  way  of  implementing  the  VT  rule  is  to 
assign  the  first  patient  to  one  of  the  two  treatments  at  random  and  then  to 
alternate  the  treatments  given  to  the  subsequent  patients  as  they  arrive. 

It  is  readily  seen  that  in  any  protocol  which  employs  the  VT  rule,  regard- 
less of  the  termination  rule  used,  we  have  E(N^)  = E(N2)  = E(N)/2. 

Since  one  of  the  basic  aims  of  a clinical  trial  is  to  reduce  the  ITN 
it  was  suggested  by  Zelen  (1969)  that  sampling  be  done  according  to  the  so- 
called  Play-the-Winner  (PW)  rule  instead  of  the  VT  rule.  The  PW  rule  was 


originally  studied  by  Robbins  (1956)  as  a data-dependeat  policy  for  the 
two-armed  bandit  problem.  According  to  this  rule  the  first  patient  to 
arrive  is  given  one  of  the  two  treatments  chosen  at  randon.  The  ith 
patient  (i  * 2 ,3, -,..-)  is  given  treatment  1 (treatment  2)  if  the  (T-l)th 
patient  received  treatment  1 (treatment  2)  and  it  succeeded  or  if  the 
(i-l)th  patient  received  treatment  2 (treatment  1)  and  it  resulted  in  a 
failure.  Zelen  investigated  the  performance  of  the  PW  sampling  rule  in 
the  Anscombe- Colton  model  (Ans combe,  1963;  Colton,  1963)  for  clinical 
trials  and  shoved  that  in  general  it  leads  to  a significant  reduction  in 
the  number  of  patients  who  receive  the  Inferior  treatment. 

Subsequently  Sobol  and  Weiss  (1970)  and  several  others  (See  Hoel, 

Sobel  and  Weiss,  1975  for  an  excellent  review)  have  shown  that  the  PW  rule 
is  superior  to  the  VT  rule  in  the  Bechhofer  model  in  terms  of  reducing 
both  the  ASN  and  ITN  for  fixed  F*  and  A*.  Most  of  the  emphasis  here  has 
been  on  devising  different  termination  rules  and  comparing  the  resulting 
protocols  with  the  already  existing  ones. 

Despite  its  poor  performance  in  terms  of  the  ASN  and  the  ITN,  the  VT 
sampling  rule  has  some  advantages  in  its  implementation  which  are  not 
shared  by  the  PW  rule.  For  example,  in  the  PW  rule,  the  allocation  of  any 
given  patient  to  a treatment  depends  on  the  outcome  of  the  preceding  trial, 
and  hence  it  is  required  that  the  response  to  the  treatments  be  instanta- 
neous or  that  the  response  be  available  by  the  time  a new  patient  arrives; 
the  VT  rule,  on  the  other  hand,  is  applicable  in  situations  of  delayed 
response,  and  allows  for  the  treatment  of  several  patients  at  each  stage. 

One  of  the  purposes  of  the  present  paper  is  to  propose  and  study 
some  sampling  rules  which  are  applicable  in  situations  of  delayed  response. 
The  simplest  case  here  is  when  patients  arrive  twice  as  fast  as  the 
response  to  any  one  of  the  two  treatments  is  made  available.  This  is 
considered  in  Section  2.  The  PI ay- the-Cl ear- Winner  (PCW)  sampling  rule 
introduced  to  handle  this  case  is  defined  as  follows:  At  the  first  stage, 

the  first  two  patients  to  arrive  receive  treatments  1 and  2 respectively. 

At  any  given  stage  assignment  of  treatments  is  made  either  for  two 
patients  or  for  one  patient  depending  on  the  outcome  of  the  preceding 
stage.  At  the  ith  stage  (i  * 2,3,...)  treatments  1 and  2 are  assigned 
randomly  to  two  patients  if,  at  the  (i-l)th  stage,  either  (a)  treatments  1 
and  2 were  assigned  to  two  patients  and  they  both  resulted  in  a success 
or  a failure  or  (b)  treatment  1 or  2 was  assigned  to  one  patient  and  it 

resulted  in  a failure.  At  the  ith  stage  (i  = 2,3, ) treatment  1 (2) 

is  assigned  to  one  patient  if,  at  the  (i-l)th  stage,  either  (a)  treatments 
1 and  2 were  assigned  to  two  patients  and  treatment  1 (2)  resulted  in  a 
success  and  treatment  2 (1)  resulted  in  a failure,  or  (b)  treatment  1 (2) 
was  assigned  to  one  patient  and  it  resulted  in  a success. 

It  can  be  easily  verified  that  the  PCW  sampling  rule  is  equivalent  to 
the  following  rule:  the  first  two  patients  to  arrive  receive  treatments  1 

and  2 randomly.  The  ith  patient  (i  = 3,4,...)  to  arrive  is  given  treatment 
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1 (2)  if  the  (i-2)th  patient  either  (a)received  treatment  1 (2)  and  it 
resulted  in  a success  or  (b)  received  treataent  2 (1)  and  it  resulted  in 
a failure.  This  formulation  implies  that  the  PCW  rule  is  equivalent  to 
implementing  two  PW  rules  in  parallel,  one  starting  with  treatment  1 and 
the  other  with  treataent  2,  a possible  solution  to  the  delayed  response 
case  suggested  by  Zelen  (1969).  This  formulation  also  shows  that  the 
PCW  rule  is  applicable  in  situations  where  the  response  to  the  treatments 
is  instantaneous  but  two  patients  are  to  be  scheduled  to  receive  treat- 
ments at  each  stage. 

The  performance  of  protocols  which  employ  the  PCW  sampling  rule  and 
various  termination  rules  considered  in  the  literature  in  connection  with 
the  PW  rule  is  sumaarized  in  Section  2.  Comparisons  with  the  corres- 
ponding protocols  which  use  the  PW  and  the  VT  sampling  rules  are  also 
presented.  It  is  shown  that  the  PCW  rule  is  in  general  superior  to  the 
other  two  rules  in  the  sense  that  it  requires  comparable  or  smaller  ASN 
and  ITN  to  reach  a decision  in  addition  to  its  greater  generality  over 
the  PW  rule.  Numerical  results  on  the  comparisons  are  presented  only  for 
P*  « 0.9S  and  A*  ■ 0.2. 

The  formulation  of  the  PCW  rule  as  two  PW  rules  in  parallel  allows 
us  to  extend  it  to  situations  where  m patients  are  to  be  scheduled  at 
each  stage  or  patients  arrive  m times  as  fast  as  the  response  to  any  one 
of  the  two  treatments  is  made  available.  This  is  accomplished  by  simply 
implementing  m PW  rules  in  parallel,[m/2]  starting  with  one  of  the  two 
treatments  chosen  at  random  and  the  remaining  starting  with  the  other 
treatment.  This  method  of  dealing  with  the  delayed-response  situations 
was  again  essentially  suggested  by  Zelen  (1969).  Section  3 deals  with 
this  rule  (denoted  PWP  for  Play- the-Winner-in-Paral lei)  for  m * 3.  In 
contrast  to  Section  2 only  a very  limited  ntmber  of  termination  rules  are 
considered  here.  Comparisons  in  terms  of  ASN  and  ITN  indicate  that  the 
behavior  of  the  PWP  rule  is  similar  to  that  of  the  PCW  rule  discussed  in 
Section  2. 

In  Section  4 we  return  to  the  problem  of  selecting  the  best  out  of 
k (k  > 3)  binomial  populations.  The  generalization  of  the  VT  sampling  rule 
to  three  or  more  populations  is  straightforward.  All  of  the  k populations 
are  sampled  at  each  stage.  Equivalently,  the  populations  are  randomly 
ordered  at  the  outset  and  are  sampled,  one  at  each  stage  according  to  this 
order,  sampling  returning  to  the  first  population  at  the  end  cf  a cycle. 

A generalization  of  the  PW  rule,  called  the  Play-the-Winner-Cyclical  (PNC) 
sampling  rule,  appropriate  for  the  present  case  was  studied  by  Sobel  and 
Weiss  (1972) . According  to  the  PWC  rule,  the  k populations  are  randomly 
ordered  at  the  outset.  Sampling  starts  with  the  first  population.  At  the 
ith  stage  (i  = 2,3,...)  the  tth  population  (t  = l,2,...,k)  is  sampled  if, 
at  the  (i-l)th  stage,  either  (a)  the  tth  population  was  sampled  and  it 
resulted  in  a success  or  (b)  the  (t-1) th  population  (0th  population  being 
identified  with  the  kth)  was  sampled  and  it  resulted  in  a failure.  Admissi- 
ble protocols  involving  the  VT  and  the  PWC  sampling  rules  and  the  so-called 
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inverse  stopping  rule  were  compared  by  Sobel  and  Weiss  (1972)  using  the 
loss  functions  defined  earlier  in  this  section.  They  showed  that  the  PWC 
rule  was  uniformly  better  than  the  VT  rule  for  this  stopping  rule.  Except 
for  their  work  nothing  is  at  present  known  about  the  behavior  of  the  VT 
or  the  PWC  sampling  rule  for  other  termination  rules. 

A natural  generalization  of  the  PCW  rule  to  k populations  is  as 
follows:  Sample  all  k populations  at  the  first  stage.  At  the  ith  stage 

(i  = 2,3,...)  sample  only  those  populations  which  were  sampled  at  the 
(i-l)th  stage  and  resulted  in  a success.  If  no  such  population  exists  at 
the  ith  stage,  then  sample  all  the  k populations  again  and  continue  the 
process.  We  shall  refer  to  this  sampling  rule  also  as  the  PCW  rule,  and 
note  that  it  is  also  applicable  in  situations  where  patients  arrive  twice 
as  fast  as  the  response  to  the  treatments  becomes  available.  In  Section  4 
we  present  some  numerical  results  for  the  PCW  rule  for  k = 3 with  the 
inverse  termination  rule  and  some  of  its  modifications  applicable  only  to 
the  VT  and  the  PCW  rules.  It  is  shown  that  with  inverse  termination  the 
PCW  and  the  PWC  rules  behave  more  or  less  identically  while  the  modified 
rules  lead  to  improved  protocols  when  employed  with  the  VT  or  the  PCW  rules. 

Throughout  this  paper  numerical  comparisons  of  the  protocols  are  given 
only  for  P*  * 0.95,  A*  * 0.2  and  a limited  number  of  values  of  the  para- 
meters Pi»P2» * • *»Pv*  More  extensive  comparisons  as  well  as  the  analytical 

results  pertaining  to  the  protocols  will  be  presented  elsewhere. 

2.  THE  PCW  SAMPLING  RULE  FOR  TWO  BINOMIAL  POPULATIONS.  In  this 
section  we  consider  several  termination  rules  proposed  in  the  literature  in 
connection  with  the  PW  sampling  rule.  The  values  of  ASN  and  ITN  are  pre- 
sented for  admissible  protocols  (P*  = 0.95,  A*  » 0.2)  which  employ  these 
termination  rules  and  the  VT,  PW  and  PCW  sampling  rules  for  A = (p.-p2)/2 
=0.2  and  p^  = (p^+P2)/2  = 0(0. 1)0.9.  The  sample  sizes  correspon- 
ding to  other  values  of  these  parameters  are  available  but  are  not  given 
here  since  the  comparisons  presented  here  reflect  the  general  performance 
of  the  protocols  quite  adequately.  Protocols  are  identified  throughout  by 
the  sampling  rule  and  the  termination  rule  employed.  For  example,  PCW3 
refers  to  the  protocol  which  uses  the  PCW  sampling  rule  and  Termination 
Rule  3.  Symbols  such  as  P(CS|PCW3),  E(N2|VT4)  and  E(N|PW1)  have  their 
obvious  meanings.  For  i = 1,2,  the  cumulative  number  of  successes  and 
failures  on  IL,  at  any  given  stage  will  be  denoted  by  and  F^  respectively. 

Termination  Rule  1 (Sobel  and  Weiss,  1970).  Sampling  stops  as  soon  as 
i Si~S2 | = r,  where  r is  chosen  so  as  to  make  the  resulting  protocol  admissi- 
ble. The  population  with  the  larger  number  of  successes  is  chosen  as  the 
better;  in  case  Sj^  = S2,  the  better  population  is  chosen  at  random. 

For  given  P*  and  A*,  the  minimum  values  or  r which  make  the  protocols 
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VT1  and  PW1  admissible  have  bean  determined  by  Sobel  and  Weiss  (1970) . 

This  can  ba  done  for  PCW1  using  a similar  method.  For  P*  ■ 0.95  and 
A*  ■ 0.2,  these  are  given  by  r ■ 4 for  VT1,  r ■ 10  for  PW1  and  r » 8 for 
PCW1.  Exact  expressions  for  the  ASN  and  ITN  of  VT1  and  PW1  are  also 
given  by  Sobel  and  Weiss  (1970).  Similar  expressions  can  be  obtained  for 
PCW1. 

Termination  Rule  2 (Sobel  and  Weiss,  1971) . Sampling  stops  as  soon 
as  either  8j  or  S2  (or  both)  equals  r where  r is  preassigned  to  make  the 

protocols  admissible.  The  population  which  achieves  r successes  first  is 
declared  the  better.  If  both  achieve  r successes  simultaneously,  then  the 
better  population  is  selected  at  random. 

It  can  be  shown  that,  for  all  Pj,  p2»  P(CS|VT2)  - P(CS|PW2)  ■ 

P(CS|PCW2).  Hence  the  same  value  of  r would  make  all  these  three  protocols 
admissible;  r equals  20  for  P*  ■ 0.95  and  A*  ■ 0.2.  Sobel  and  Weiss  (1971) 
have  shown  that  E(N|PW2)  < E(N|VT2)  and  E(N2|PW2)  < E(N2|VT2)  uniformly  in 

Pj,  and  p2.  These  inequalities  can  be  shown  to  hold  with  PW2  replaced  by 

PCW2. 

The  following  termination  rule  is  a modification  of  Termination  Rule  2, 
and  is  applicable  to  the  PCW  and  the  VT  sampling  rules  but  not  to  the  PW 
rule.  It  is  defined  in  terms  of  the  cumulative  number  of  'clear  successes', 

S®  on  (i  ■ 1,2),  defined  by  sj  ■ - (the  number  of  times  IIj  and  n2 

were  sampled  together  and  they  both  succeeded). 

c c 

Termination  Rule  3.  Sampling  stops  as  soon  as  either  Sj  or  S2  (or 

both)  equal  r.  The  population  with  the  larger  total  number  of  successes  is 
chosen  as  the  better.  If  S.  » S.,  then  the  better  population  is  chosen  at 
random. 

For  P*  ■ 0.95  and  A*  ■ 0.2,  the  r value  which  makes  the  protocol 
admissible  equals  12  for  PCW3  and  9 for  VT3. 

The  next  termination  rule,  originally  studied  by  Hoel  (1972)  for  the 
PW  sampling  rule,  is  based  on  the  statistics  Rj  ■ Sj  + P2  and  R2  " S2  * Pl' 

Termination  Rule  4 . Sampling  stops  as  soon  as  either  R^  or  R2  reaches 
a preassigned  value  r,  and  the  population  Jl^  is  selected  as  the  better  if 
R^  reaches  r first  for  i ■ 1,2.  With  the  PCW  and  the  VT  sampling  rules, 
r ♦ 1 may  be  reached  before  stopping.  If  both  R^  and  R2  reach  simultaneous- 
ly, as  is  possible  with  the  PCW  and  the  VT  rules,  the  better  population  is 
selected  at  random. 


It  can  be  shown  that  P(CS|PCW4)  . P(CS|PWfj . Hence,  a»  in  the  case  of 
Termination  Rule  2,  the  same  value  of  r would  make  both  of  these  protocols 
admissible.  For  P*  ■ 0.95  and  A*  ■ 0.2,  the  minimum  value  of  r equals 
33  for  PCW4  and  PW4,  and  29  for  YT4. 

Termination  Rule  5 (Pushimi , 1973) . Sampling  stops  as  soon  as  either 
I S.-S-l  ■ r or  F.  + F,  “ 8*  Th*  population  with  the  larger  maber  of 
successes  is  chosen  as  the  better,  and  in  case  S.  • S-,  the  better  popula- 
tion is  chosen  at  random.  1 1 

For  any  given  P*  and  A*  there  are  In  general  several  values  of  the 
pair  (r,s)  which  would  make  the  protocols  VTS,  PW5  and  PCW5  admissible. 
Pushimi  (1973)  shows  how  the  'best*  pair  can  be  obtained  for  PW5  using  the 
property  that,  as  s tends  to  •»,  the  present  termination  rule  reduces  to 
Termination  Rule  1 and,  as  r tends  to  •»,  it  reduces  to  Termination  Rule  2. 
The  'best*  choice  of  (r,s)  corresponding  to  PCW5  can  also  be  determined 
along  the  same  lines. 

Termination  Rule  6 (Nordbrock,  1975).  Sampling  stops  as  soon  as  either 

I A * S A Sj 

Is.  - SJ  ■ r or  |p.  - pJ  > ■ ■— — ■ where  p,  ; the  population 

(J»l+F2)  1 (Sl*Fi) 

with  the  larger  number  of  successes  is  chosen  as  the  better,  and  in  case 
S1  ■ S2,  the  better  population  is  chosen  at  random. 

The  remarks  made  in  connection  with  Termination  Rule  5 regarding  the 
choice  of  (r,s)  apply  here  as  well.  (r,s)  equals  (8,4.2)  for  PCW6,  (11,4.2) 
for  PW6  and  (4,3.8)  for  VT6  when  P*  ■ 0.95  and  A*  ■ 0.2. 

Table  1 summarizes  our  results  on  the  ASN  and  the  ITN  of  the  protocols 
introduced  above  for  P*  ■ 0.95,  A*  » A ■ 0.2  and  p0  ■ 0.1(0. 1)0.9.  As 

mentioned  earlier,  the  overall  behavior  of  the  protocols  is  adequately 
reflected  by  the  results  of  this  table.  Zt  can  be  seen  that,  except  for  a 
few  exceptions  (for  example,  for  values  of  pQ  very  close  to  1),  the  PCW  rule 

requires  comparable  or  smaller  sample  sizes  when  compared  to  the  VT  or  the 
PW  rule.  The  increased  generality  of  the  VT  sampling  rule  over  the  PCW  rule, 
and  that  of  the  latter  over  the  PW  rule  should  also  be  kept  in  mind  when 
comparing  these  protocols. 

3.  THE  PWP  SAMPLING  RULE  FOR  TWO  BINOMIAL  POPULATIONS.  The  PWP 
sampling  rule  is  considered  Here  for  Termination  kules  2 and  5 of  the  previ- 
ous section.  For  P*  ■ 0.95  and  A*  ■ 0.2,  r ■ 20  for  PWP2,  and  (r,s)  * (8,41) 
for  PWP5.  Table  2 gives  the  sample  sizes  for  these  two  protocols  corres- 
ponding to  the  same  values  of  the  parameters  as  in  Table  1.  It  can  be  seen 
that  the  behavior  of  the  PWP  sampling  rule  is  quite  similar  to  that  of  the 
PCW  rule. 
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4.  THE  PCW  SAMPLING  RULE  FOR  THREE  BINOMIAL  POPULATIONS.  The  PCW 
sampling  rule  for  three  binomial  populations  is  considered  here  with 
Termination  Rule  2 defined  in  Section  2,  and  two  of  its  modifications 
applicable  only  to  the  PCW  and  the  VT  sampling  rules.  The  protocol 
PWC2  has  been  studied  by  Sobel  and  Weiss  (1972) . Closed  font  expressions 
for  P(CS|PCW2)  and  B(Ni|PCW2),  i ■ 1,2,3,  can  be  obtained  using  the  method 

of  Sobel  and  Weiss  (1972).  Numerical  results  on  the  probabilities  of 
correct  selection  for  various  values  of  the  parameters  indicate  that,  as 
in  the  case  of  two  populations,  P(CS|PCW2)  ■ P (CS | PWC2)  even  though  we 
have  not  been  able  to  establish  this.  For  P*  - 0.95  and  A*  ■ 0.2,  the 
conon  value  of  r which  makes  the  protocols  PCW2  and  PWC2  admissible  is  28. 

The  modifications  of  Termination  Rule  2 which  we  consider  are  quite 
similar  to  Termination  Rule  of  Section  2 in  that  they  are  obtained  by 
defining  'clear  successes'  appropriately.  In  the  first  modification, 
Termination  Rule  3',  we  define  ■ (number  of  times  all  three  populations 

were  sampled  and  either  n^aftd  TI^  or  n^and  n succeeded  and  the  other 

failed)  * (number  of  times  and  Tlj  or  ltj  and  were  sampled  and  IIj 

succeeded  and  the  other  failed)  * 2 (number  of  times  all  three  populations 
were  sailed  and  alone  succeeded),  and  T^  and  T^  symmetrically.  Termi- 
nation Rule  3'  is  then  obtained  from  Termination  Rule  2 by  simply  replacing 
by  T^  for  i ■ 1,2,3..  Similarly,  Termination  Rule  3"  is  obtained  from 

Termination  Rule  2 by  replacing  Sj^  by  UA  for  i « 1,2,3,  where  U1  ■ (number 
of  times  all  three  populations  were  sampled  and  either  IT  and  n or  n 

X ■ 1 

and  n3  succeeded  and  the  other  failed)  ♦ (number  of  times  and  n2  or  JIj 

and  H3  were  sampled  and  they  both  succeeded)  + 2 [(number  of  times  all 

three  populations  were  sampled  and  alone  succeeded)  ♦ (number  of  times 

nx  and  n2  or  and  n3  were  sampled  and  Ilj  alone  succeeded)  ♦ (number  of 

times  alone  was  sampled  and  it  succeeded)],  and  and  are  analagous- 

ly  defined.  The  r values  which  make  the  Termination  Rules  3'  and  3" 
admissible  for  P*  ■ 0.95  and  A*  ■ 0.2  are  respectively  24  and  37. 

Table  3 summarizes  the  expected  sample  sizes  for  the  protocols  of  this 
section  for  selected  values  of  the  parameters.  As  in  the  case  of  Tables 
1 and  2,  more  extensive  comparisons  are  available  but  are  not  presented. 

It  is  clear  from  Table  3 that  PCW3'  is  to  be  preferred  over  the  others. 
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performed  at  the  Temple  University  Computer  Center.  Part  of  the  second 
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PCW2 

PW2 

VT2 

PCW2 

PW2 

VT2 

0.1 

81.0 

80.5 

100.0 

181.0 

180.5 

200.0 

0.2 

52.9 

52.4 

66.7 

119.5 

119.1 

133.4 

0.3 

38.6 

38.1 

50.0 

88.5 

88.0 

100.0 

0.4 

29.6 

29.1 

39.9 

69.4 

69.0 

79.8 

0.5 

23.3 

22.8 

33.2 

56.4 

55.9 

66.4 

0.6 

18.4 

17.8 

28.4 

46.6 

46.0 

56.8 

0.7 

14.1 

13.4 

24.9 

38.8 

38.1 

49.8 

0.8 

9.9 

8.8 

22.2 

31.8 

30.7 

44.4 

0.9 

5.0 

2.5 

20.0 

24.9 

22.4 

40.0 

TABLE  1.  (Continued) 
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po 

B(N2) 

E (N) 

PCW4 

PW4 

VT4 

PCW4 

PW4 

VT4 

0.1 

26.6 

26.5 

24.3 

59.4 

59.0 

48.6 

0.2 

25.9 

25.8 

24.3 

58.6 

58.3 

48.6 

0.3 

25.1 

24.9 

24.3 

57.7 

57.3 

48.6 

0.4 

24.0 

23.8 

24.2 

56.5 

56.2 

48.4 

0.5 

22.6 

22.3 

24.2 

'55.0 

54.7 

48.4 

0.6 

20.7 

20.3 

24.2 

52.9 

52.6 

48.4 

C.7 

17.9 

17.3 

24.3 

50.1 

49.6 

48.6 

0.8 

13.3 

12.4 

24.3 

45.5 

44.8 

48.6 

0.9 

5.0 

2.5 

24.3 

37.0 

35.0 

48?6 

PCW5. 

PWS 

VT5 

PCW5 

PW5 

VT5 

0.1 

20.3 

20.4 

19.7 

45.2 

45.9 

39.4 

0.2 

20.5 

22.0 

19.9 

46.2 

50.1 

39.8 

0.3 

20.0 

22.7 

20.4 

45.6 

52.6 

40.8 

0.4 

18.8 

22.2 

21.2 

43.6 

52.7 

42.4 

0.5 

16.8 

20.3 

22.2 

39.9 

49.8 

44.4 

0.6 

14.0 

16.9 

23.3 

34.4 

43.5 

46.6 

0.7 

10.8 

12.2 

24.4 

28.0 

34.1 

48.8 

0.8 

7.8 

7.1 

24.9 

22.0 

23.9 

49.8 

0.9 

5.0 

2.3 

25.0 

17.0 

14.2 

50.0 

PCW6 

PW6 

VT6 

PCW6 

PW6 

VT6 

0.1 

13.4 

13.5 

14.1 

29.1 

29.8 

28.2 

0.2 

13.7 

13.9 

14.8 

30.1 

31.1 

29.6 

0.3 

14.2 

14.6 

14.4 

31.9 

33.3 

28.9 

0.4 

15.6 

16.2 

16.0 

35.6 

37.9 

32.0 

0.5 

15.7 

17.8 

17.5 

37.1 

43.2 

35.0 

0.6 

13.9 

16.7 

18.7 

34.2 

41.5 

37.3 

0.7 

10.8 

12.0 

19.3 

27.9 

33.7 

38.6 

0.8 

7.7 

7.1 

19.9 

21.8 

23.9 

39.9 

0.9 

4.9 

2.4 

19.9 

16.9 

14.5 

39.8 

t 

\\ 

I 

\ 
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TABLE  2 . EXPECTED  SAMPLE  SIZES  fOR  THE,  PROTTlOI.S  OF  SECTION  3 
FOR  P*  • 0.95  mND  6 ■ 6*  * 0.2 


po 

B(N2|PWP2) 

E{N jPWP2) 

E(N2|PWP5) 

E(N|PWP5) 

0.1 

81.1 

166.4 

20.2 

45.3 

0.2 

53.0 

113.1 

20.6 

46.6 

0.3 

38.7 

85.7 

20.2 

46.3 

0.4 

29.8 

68.5 

19.1 

44.5 

0.5 

23.5 

56.4 

17.3 

41.2 

0.6 

18.7 

47.2 

14.7 

36.2 

0.7 

14.6 

39.8 

11.8 

30.3 

0.8 

10.7 

33.4 

9.0 

24.9 

0.9 

6.9 

27.6 

6.7 

20.5 

TABLE  3.  EXPECTED  SAMPLE  SIZES  FOR  THE  PROTOCOLS  OF  SECTION  4 
FOR  P*  - 0.95  AND  L*  - 0.2 


P1 

VP3 

ECNj) 

E(N2)«ECN3) 

| 

PWC2 

PCW2 

PCW3' 

PCW3" 

PWC2 

PCW2 

PCW3' 

4 

PCW3"  i 

0.2 

0 

140.0 

140.0 

67.3 

95.0 

112.3 

113.0 

54.9 

77.0  \ 

0.3 

0.1 

93.3 

93.3 

51.6 

67.4 

73.0 

73.6 

41.2 

53.5 

0.4 

0.2 

70.0 

70.0 

44.4 

54.0 

52.9 

53.6 

34.4 

41.7 

0.5 

0.3 

55.9 

55.9 

40.7 

45.6 

40.4 

41.1 

30.3 

33.8 

0.6 

0.4 

46.4 

46.4 

38.8 

39.7 

31.5 

32.2 

27.2 

27.9  1 

0.7 

0.5 

39.6 

39.7 

37.9 

35.1 

24.4 

25.2 

24.4 

22.8  I 

0.8 

0.6 

34.6 

34.6 

37.2 

31.2 

18.1 

19.1 

20.8 

17.8  j 

0.9 

0.7 

30.8 

30.8 

35.4 

27.4 

11.3 

12.9 

14.9 

12.2  | 

i 1.0 

0.8 

27.9 

28.0 

29.2 

22.8 

1.7 

5.0 

5.0 

5.Q  | 
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PREDICT1VISM  AND  SAMPLE  REUSE 
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ABSTRACT.  Thia  paper  amphaalsea  the  paramount  importance  of  prediction 
a*  opposed  to  eatimation  and  reviews  a variety  of  general  structures  for 
implementing  the  predictivistic  outlook.  It  also  stresses  in  particular  the 
newly  devised  predictive  sample  reuse  method  as  a highly  flexible  and  versa- 
tile tool  in  low  structure  situations.  An  illustration  is  given  to  a simple 
survival  situation. 


1 1 


1 , INTRODUCTION.  The  fundamental  thesis  of  this  paper  is  that  the 
inferential  emphasis  of  Statistics,  theory  and  concomitant  methodology,  has 
been  misplaced.  By  this  is  meant  that  the  preponderance  of  statistical 
analyses  deals  with  problems  which  involve  inferential  statements  concerning 
parameters.  The  view  proposed  here  is  that  this  stress  should  be  diverted 
to  statements  about  observables.  With  regard  to  parameters  we  take  the 
narrow  view  which  relegates  them  at  moat  to  be  components  of  a statistical 
model  that  are  not  capable  of  being  observed  or  potentially  observed.  This 
is  not  necessarily  to  deny  them  their  utility  in  many  hypothecical  frame- 
works but  there  has  been  a strong  tendency  to  exaggerate  their  importance  in 
statistical  inference.  Even  such  a compelling  "parsimtst"  as  the  speed  of 
light  la  in  some  sense  ostensibly  capable  of  being  measured  (observed)  though 
perhaps  subject  to  error.  In  this  sense  it  is  at  least  a potentially 
observable  entity.  Other  values  which  often  are  miadasignated  as  parameters 
are  those  defined  as  a function  of  a finite  number  of  observables  or  poten- 
tial observables  which  typically  occur  in  sample  survey  situations.  For  ex- 
ample we  may  be  trying  to  "estimate"  the  total  response  of  a specific  finite 
population  by  observing  some  random  portion  of  that  population.  The  unobserved 
responses  are  presumably  potentially  observable  (or  the  randomization  is  mean- 
ingless) and  it  is  maintained  that  we  are  basically  predicting  them  or  some 
function  of  them.  This  is  certainly  within  the  realm  of  prediction  though  it 
is  generally  referred  to  as  estimating  a parameter  of  a finite  population. 

Hence  these  two  previously  mentioned  cases,  measuring  some  physically  mean- 
ingful constant  and  estimating  functions  of  observables  are  within  the  realm 
of  predictlvl8m.  It  is  our  contention  that  in  other  cases  the  Introduction 
of  a convenient  parametric  statistical  model  seems  to  impel  statisticians  to 
reformulate  an  experimenter's  often  imprecisely  framed  question  concerning 
the  data  into  a parametric  analysis  even  when  the  parameters  are  completely 
artificial  constructs.  We  then  proceed  to  foist  upon  the  unwary  client 
"precise"  statements  about  these  too  often  nonexistent  entitles.  This  ten- 
dency Is  reinforced  because  we  have  too  long  been  subjected  Co  solutions  to 
hypothetical  problems  which  invariably  begin  --  "suppose  we  are  interested 
in  the  estimation  of  a parametric  function  BLAH(d)."  This  stress  on  para- 
metric Inference  made  fashionable  by  mathematical  statisticians  has  been  not 

This  work  was  supported  In  part  by  U.J.  Army  tirant  LMIK'OU-Yh-G-Oc 16. 


only  a comfortable  posture  but  also  a secure  buttress  for  the  preservation 
of  the  high  esteem  enjoyed  by  applied  statisticians  because  exposure  by 
actual  obaarvation  in  parametric  estimation  is  rendered  virtually  impossible. 

Of  course  those  who  opt  for  predictive  Inference  i.e.  predicting  obser- 
vables or  potential  observables  are  at  risk  in  that  their  predictions  can  be 
evaluated  to  a large  extent  by  either  further  observation  or  by  a sly  client 
withholding  a random  portion  of  the  data  and  privately  assessing  a statis- 
tician's prediction  procedures  and  perhaps  concurrently  his  reputation. 
Therefore  much  may  be  at  stake  for  those  who  adopt  the  predlctivlstlc  or 
obeervablllstic  or  aparametric  view.  But  its  relevance  is  clear. 

It  was  the  burden  of  a previous  paper  Geisser  (1971)  to  argue  that 
most  problems  currently  cast  in  terms  of  parametric  estimation  and  testing 
could  be  more  Informatively  reformulated  in  a predict ivis tic  mode.  A general 
catalogue  of  such  problems  wae  presented  there  and  the  Bayesian  inferential 
approach  stressed.  In  this  paper  we  shall  discuss  the  problem  of  prediction 
per  se  from  a variety  of  structures  ranging  from  hi»h  to  low  depending  upon 
the  amount  of  Information  Infused  into  the  model.  In  particular  we  will 
stress  a new  low  structure  approach  termed  predictive  sample  reuse. 

2 . HIGH  STRUCTURE . The  high  structure  approach  to  statistical  prediction 
involves  the  tight  apparatus  of  a prior  distribution  for  the  parameters  invol- 
ving known  hyperparameters  and  a specified  likelihood,  i.e.  a Joint  sampling 
distribution  of  observables,  past  and  future,  as  it  were.  Hence  we  need  assume 

that  (x^,...,^;  • ••  >xhim^  or  *n  * ®°re  co°t>*ct  notation  (x^f  X(M)^ 

has  Joint  distribution  F(x^j  x^|9)  where  9 is  a set  wi'  unknown  para- 
meters. Further,  a prior  distribution  on  9,  say  G(9|t)»  is  also  assumed 
where  the  set  of  hyperparameters  r is  known.  The  posterior  distribution  of 

6 is  then  based  on  the  observed  X^N'  ■ x^, 

G(8|.<tl).T)-1,il’W||j°l8[T)  (2.1) 

F(xw|t) 

where 

F(x^|t)  -J*  F(x^|0)  d G ( 0 1 t)  . ^2,2) 

This  than  nermits  the  calculation  of  the  predictive  distribution  of  X^ 
given  X'  ' and  T,  resulting  in 

p(x(M)lx(N),T)  - / F(x(m)|xM,9)  d G(9|x<N>,t)  (2.3) 


where 
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F(x(m)|x1 


',e)  - 


F(x'**  | 8) 


(2.4) 


The  denominator  of  the  above  being  the  marginal  sampling  distribution  of 

the  observed  random  variables  . Zn  essence,  (2.3)  represents  the 
ultimate  in  statistical  prediction  and  everything  else  is  a summary  of 
one  kind  or  another  of  this  distribution  function.  If  point  prediction 
is  of  interest  then  one  might  choose  as  a point  predictor  the  predictive 
expectation  of  (2.3) 

E(X(M)ix<N)  - x(H),  t)  (2.5) 

or  the  median  or  the  mode  of  (2.3)  or  whatever  ensues  from  a particular 
loss  function. 


Often  in  this  approach  there  is  a necessary  relaxation  of  the 
assumption  that  r is  known.  This  is  generally  handled  in  one  of  two 
waya.  First  it  is  often  the  case  that  little  loss  in  terms  of  inco- 
herence is  engendered  by  assuming  an  improper  prior  for  the  hyperpara- 
meter  T.  Hence  a new  predictive  distribution  is  obtained  by 
calculating 

P(x(m)|x<N>)  -/  P(X(M)I*^*T)  d G (t)  (2.6) 

A second  approach,  usually  associated  with  empirical  Baves  procedures,  is 

to  "estimate"  t from  the  marginal  distribution  F(x^  |t)  given  in 
(2.2)  by  maximum  likelihood  or  the  method  of  moments  or  any  other  conveni- 
ent procedure.  This  then  results  in  an  approximate  predictive  distribution 

P(x(m)|x<N>,<0  and  a point  predictor,  say,  e(x^mj  |x^ , tt). 


Historically  there  have  also  been  two  other  high  structure  approaches. 
The  first  by  Fisher  (1956)  was  termed  fiducial  inference  and  the  second 
Fraser  (1968)  termed  structural  inference.  These  generally  require  for 
their  implementation,  a much  more  restrictive  sampling  distribution  and  an 
assumption  of  complete  Ignorance  concerning  9 which  in  turn  implies  the 
absence  of  t.  Here  one  would  calculate  the  fiducial  or  structural  distri- 


but ion 


cp(0|x^) 


and  then  compute  the  predictive  distribution  of 


X(M), 


VX(M)|x(N))  "J*  F(x(M)ly^’9^  d V (8|x(N)>‘ 


(2.7) 


This  type  approach  is  at  most  valid  only  under  stringent  assumptions. 
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Many  statisticians  have  questioned  it>  validity  entirely.  Recently  Barnard 
(1975)  has  developed  a pivotal  approach  to  parametric  Inference.  Hie 
approach,  aa  demonstrated  by  Hlnklay  (1975) • can  easily  be  adapted  to  a 
predictivlstic  mode  by  finding  predictive  pivots.  It  appears  also  to  be 
capable  of  incorporating  certain  types  of  prior  Information. 

3 , INTERMEDIATE  STRUCTURE . The  classical  (Neyman-Pearson)  approach 

only  assumes  (X^}X(M))  -*P(x^N\x^|0),  i.e.  a sampling  distribution 

and  enough  structure  on  the  distribution  so  that  one  can  compute,  independent 
of  0, 


Pr  [x(M)  6 A(XW)]  . p. 

(lO  (v) 

This  of  course  la  not  a probability  statement  for  X'  ' m x'  , aa  in  the 
Bayea  approach.  Here  p represents  the  degree  of  confidence  that 

X(m)  f A (x^),  p being  a valid  probability  in  the  sense  of  the  long-term 

frequency  of  repetitions  from  the  Joint  set  of  random  variables  (x'N';X/M\). 
In  other  words,  p is  the  proportion  of  times  in  the  long  run  that  ' ' 

X/u\  € A (x^)  and  is  interpreted  as  the  confidence  one  has  in 

(M;  ' 

X(m)  € A (x'  ;)  once  X'  ' ■ xv  ' has  been  observed.  This  is  usually 

referred  to  as  a tolerance  Interval  in  the  statistical  literature.  For 
example,  if  we  are  dealing  with  the  problem  of  predicting  the  N + 1 obser- 
vation Xjj  + j from  the  first  N observations,  X^^ X^  and  assume  that 

(Xt}  1 m 1 N + 1 are  ild  N(0,l)  then  one  notes  that  for 

_1  N 

*N  - " Si  h 


Wi 


-1 

~N(0,  1+N  ). 


From  (3.1)  we  obtain 

r xn4.i-^  1 

^ s * - 


(3.1) 


PrL 

l+N 

■ I (b)  - i (a)  ■ p, 


k 


Pr  LXN+avl+H  * s;  X, 


N+l  * Vb  1+N 


(3.2) 


where  $ (y)  is  the  standard  normal  distribution  function. 


While  (3.2)  is  a probability  statement,  once  we  observe  X^  - x^  and 

calculate  the  limits,  this  now  becomes  a confidence  statement  and  has  only 
the  restricted  interpretation  discussed  before. 


A point  predictor  is  usually  obtained  by  Inserting  in 


E(X(^|X(l,ix(N),0) 


an  estimate  9 (*'"')  for  8 - tha  axpactation  being  taken  over  the  condi- 
tional sampling  diatribution. 

Another  approach,  having  ita  roota  in  Fisher's  work  (1956) , tented 
redlctive  likelihood,  haa  recently  been  independently  introduced  by  Hlnkley 


ifVu*WT*w* wth rrmft/L'ifwrrri ' ■ ■ j,  vbij.wj. um. liumm-w 


though  in  an  extend ad  aenae,  play*  the  key  role.  It  la  aaaumed  that 

(X^jX^j)  have  likelihood  L(x^;x^m)|0)  which  admlta  a totally 

sufficient  reduction  of  the  data.  In  the  case  of  independent  and  Identically 
diatributed  random  varlablea  a minimal  aufficlent  reduction  need  only  be 
available.  In  thia  latter  caaa  aa  pointed  out  by  Piaher  (1956),  a minimal 
aufficlent  atatlatlc  la  a function  of  the  individual  aufficlent  atatiatlca 
from  any  portion  of  the  entire  sample.  The  concept  of  a totally  aufficlent 
atatlatlc  Introduced  by  Lauritzen  (1974)  permita  extension  of  thia  reault 
to  the  more  general  caae  of  dependence. 


) and 


- s(X' 


, ) be  the  aet  of  totally 


aufficlent  atatiatlca  for  8 baaed  on  the  random  varlablea  to  be  obaerved 
and  thoae  that  are  to  be  obaerved  and  predicted,  respectively.  Then  one  can 
obtain,  independent  of  9,  the  conditional  probability  function 


(N) -00  » 


(3.3) 


which  is  now  defined  aa  being  proportional  to  the  predictive  likelihood  i.e. 


fNlVM>  “ ',rlk 


(3.^) 


Thia  ia  then  treated  aa  la  the  usual  L x 9 where  now  X/ux  takes  on  the 

(N)  W 

role  of  0.  For  the  fixed  value  x'  , the  predictive  likelihood  orders  the 
plausibility  for  various  values  X^  ■ x^j.  For  a simple  example,  consider 

X^,  i ■ 1 N + M as  Bernoulli  lid  random  variables  where 

P(Xj-l)  ■ l-P(Xj-O)  ■ 8.  If  r out  of  the  first  N are  l'a,  we  can  order 

possible  predictive  values  for  the  number  of  l'a,  say  t,  in  the  next  H 

N M 

trials.  Defining  R a^X.,  T XN  , , which  are  sufficient,  we  can 

j • i ..1  Pi  + X 


compute  in  a simple  fashion 


R+T  - r+tl  - 


0 (?)  , . , 

* prlk  <rl£> 

'r  + t' 


(3.5) 


which  ia  used  to  order  the  plausible  values  for  t«0 H. 


A point  predictor  can  conceptually  be  obtained  by  maximising  the 
predictive  likelihood.  In  the  caae  where  M > 1 and  the  random  variables 
are  lid.  it  la  clear  that  prlk  (x^)  will  have  multiple  maxima  due  to 

the  exchangeability  of  the  likelihood.  This  must  be  so  and  should  be  no 
cause  for  concern.  In  the  previous  exanple  though,  there  may  be  a unique 
maxima  at  soma  value  of  t and  be  adequate  if  t is  to  be  predicted.  It 
is  clear,  however,  that  if  the  individual  , . . . ,X^+M  are  to  be  pre- 

dicted and  the  maximum  was  at  t ■ tQ,  say,  then  every  partition  of 

x^+j XN+M  *nto  to  ant*  M-t0  O's  would  also  yield  identical 

maxima  of  the  prlk 

For  a variety  of  interesting  applications  of  predictive  likelihood  to 
standard  statistical  situations,  the  reader  is  referred  to  Hinkley  (1975). 

4.  LOW  STRUCTURE  AND  ASSESSMENT.  Before  actually  discussing  techniques 
available  in  low  structure  situations  it  will  be  useful  to  review  a very  old 
and  informal  method  of  considerable  value  in  comparing  point  predictors. 

Suppose  several  predictors  are  suggested  for  a set  of  data,  then  a fruitful 
comparison  of  them  may  be  accomplished  by  a validation  technique.  The  sample 

xW  randomly  divided  into  two  parte  x^N"n^-(x1> . . . ,xN_n)  and 

x^-  (xN_n+l  ’ ‘ * ,xlP  C4H*d  the  construction  sample  and  the  validation 

sample  respectively.  Assume  also  that  associated  with  each  sample  point  x^ 

la  a known  value  s^.  The  data  analyst  then  computes  the  competing  predictors 

from  the  construction  sample  obtaining,  say,  £ ..  (x^"n\z^“n^  ;*,)  ■ 

th  J 

as  the  1 predictor  for  the  value  x^  at  known  value  s^, 

J ■ N-n+1,.,.,  N;  1 ■ 1 K where  K represents  the  number  of  predictors 

to  be  compared,  and  Z^N_n^  ■ (s^  ...,*N_n).  Firat  the  ««lduaia 

“ xj  " rjt  are  computed  and  then  the  empirical  distribution  functions  of 

residuals  are  plotted  for  each  predictor.  A comparison  of  these  empirical 
distribution  functions  will  shed  much  light  in  determining  which  predictor  is 
most  appropriate.  Sometimes  when  the  validation  sample  is  not  very  large  a 
relevant  summary  measure  of  the  predictive  discrepancy  is  adequate  for  compari- 
son. For  example  we  might  compute  the  predictive  mean  squared  error 

-1  N 

a?  “ (N-n)  Tj  r^ . i-l,...,K.  This  procedure  is  generally  useful  only 

1 J -N-n+1  1 

when  a reasonably  large  number  of  observations  is  at  hand.  This  is  often  not 
the  case.  Also  the  procedure  seems  inefficient  in  that  it  does  not  extract  all 
of  the  information  in  the  data.  To  overcome  this  a technique  which  is  referred 
to  as  simple  cross-validation  may  be  substituted. 
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L«t  * ' ” ' - xJ+1,...,xN)  with  corresponding 

2^(N“l)  m (B^, be  the  date  set  with  the  Jtl*  obser- 
vation omitted.  Now  for  each  predictive  function  we  compute  the  predictor 
" *ji  a^)  for  the  omitted  obaervation  x^  and  re- 

peat this  for  J»1,...,N  for  each  predictor  obtaining  r^  ■ 4^  - Xj. 

Similarly  as  in  the  validation  set  up,  we  are  in  a position  to  compare  for 
each  predictor  its  empirical  distribution  function  or  a relevant  summary 
measure  of  predictive  discrepancy.  However  in  the  case  of  simple  cross 
validation  we  have  N residuals  for  each  predictor  Instead  of  n aa  in  the 
validation  ease.  One  caution  is  in  order  — in  the  validation  case  the 
residuals  are  dependent  only  by  virture  of  the  same  predictive  function 
while  in  the  simple  cross-validation  some  further  algebraic  dependence 
creeps  in  as  a result  of  using  the  data  rapetltlvely.  On  the  other  hand 
the  simple  cross-validation  assessment  uses  all  of  the  data  while  the  vali- 
dation assessment  only  uses  a sample  of  the  data.  Notwithstanding,  the 
cross-validatory  assessment  procedure  is  certainly  very  useful  for  the 
comparison  of  predictors  generated  from  various  structural  assumptions  as 
the  basic  dependence  is  the  same  for  all  of  them. 

However  there  are  situations  where  specification  of  a particular 
sampling  distribution  and  the  resultant  predictor  based  on  such  assumptions 
may  be  fraught  with  peril.  When  a particular  sampling  paradigm  becomes  diffi- 
cult or  impossible  to  Identify,  and  yet  prediction  is  necessary,  data  analytic 
techniques  based  on  minimal  assumptions  need  come  to  the  fore.  One  such 
technique,  termed  predictive  sample  reuse  (PSR),  Geisaar  (1974a,  1975*)  or 
croes-validatory  choice,  Stone  (1974a),  is  currently  a leading  candidate  for 
a satisfactory  resolution  of  this  low  structure  case.  It  may  also  he  of 
service  in  what  are  basically  higher  structure  situations  as  we  will  detail 
later.  First  of  all  the  PSR  method,  when  flexibly  ueed,  la  very  likely 
to  be  robust  for  a variety  of  sampling  paradigms.  A second  feature  is  that 
it  simulates  the  predictive  process  upon  itself  in  some  optimal  fashion  often 
using  some  structural  hints.  It  is  even  capable  in  one  of  its  manifestations 
of  comparing  a variety  of  approaches.  Essentially  the  goal  is  to  predict  a 
future  observation  or  set  of  such,  or  some  function  of  them.  For  the  purposes 
of  this  exposition  we  shall  restrict  ourselves  to  a single  future  observation 
with  a form  arbitrarily  chosen  for  predicting  it  as 

x - x(X,Z,z;u)  a € n (4.1) 

where  a is  some  set  of  unknown  values,  X - (x^,...,x^)  represents  a sample 

of  siee  N and  with  each  x^  is  asaoriafed  a known  z^,  and  Z ■ (z^,...,z^). 

It  must  be  stressed  that  in  this  approach  a is  not  a platonic  ideal  nor  in 
any  sense  a true  value  of  paramount  importance.  It  is  to  be  regarded  as  merely 

a convenient  way  of  forming  a predictive  function.  Let  represent  the 
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,th 


1 partition  of  tha  sample  N-n  ratalnad  and  a omitted  obaervatlona 
0 < n fi  M,  where  M la  tha  largeat  integer  such  that  the  predictive  function 
(4.1)  can  be  formed  with  N-M  obaervatlona.  More  precisely,  the  observa- 
tional aet  X and  tha  sat  Z with  which  it  is  associated  are  partitioned 
such  that 


p (N-n)  . /v(N-*0  z(N-n).  „(n)  .(nK 
Pi  'Xir  * ir  * Xio  * Zio  ' 


(4.2) 


.th 


la  tha  1'“  partition  belonging  to  a set  T of  partitions  relevant  to  a 
particular  schema  of  observational  omissions  where  (X^~n)  and 

(xj"\  2 |”^)  represent  the  N-n  retained  and  n omitted  data  sets,  respec- 
tively. Let  the  total  number  of  such  partitions  be  P(N,  n,  r),  or  simply  P. 
The  specified  predictive  function  is  then  applied  to  the  retained  observations 
for  prediction  of  the  omittad  observations  for  each  partition  with  the  unknown 
set  of  values  a estimated  by  means  of  optimising  an  average  discrepancy 
measure,  say, 


><»> 


where  each  element  in  the  sat 


lo 


and 


is  the  form  of  the  predictive  function 

(n) 


io 


d is  a measure  of  the  discrepancy  of  tha  sat  of  values 

the  aet  of  predicted  values  for  given  a 

with  respect  to  u in  some  sense.  On  the  basis  that  this  leads  to  a 
solution  say,  &,  we  obtain  the  predictor  £ ■ x(X,Z,*;fl)  - f. 


from 


D„  (a)  is  then  optimised 
si  i n 


When  predictive  functions  are  to  be  compared  irrespective  of  their 
generation  one  can  use  a cross -validatory  assessment.  For  a given  discrepancy 

measure  we  could  consider  for  the  1^  partition  the  set  of  retained  observa- 
tions and  associated  values  (x^"n)  Z^"n^)  and  partition  this  into  two  sets 

(x|^“2n\  zito)*  From  ^is  reduced  set  of  N-n  observations 

and  associated  values  we  would,  as  previously,  obtain  an  and  compute 

the  discrepancy  (not  necessarily  based  on  the  same  d as  was  used  to  obtain 
the  predictor)  between  the  values  predicted  for  the  n omitted  observations 
and  the  actual  observations  themselves.  Repeating  this  for  each  i we  would 
then  compute  an  overall  discrepancy  measure 


Vn  ■ p-1"'1 


£ d(x 


(n)  £(n)  fY(N-n)  ,(N-n)  „(n). 


ter 


io 


io 


(X 


ir 


Jir 


z£';  at))  (4.4) 
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for  each  predictive  function.  Thia  meeaure  then  would  be  relevant  to 
esseaalng  either  different  predictive  functions  or  various  estimators 
of  a in  terms  of  predictive  discrepancy  for  the  same  predictive 
functions.  We  also  note  that  comparisons  other  than  the  average 

D„  can  be  utilised,  e,g.f  empirical  distributions  of  the  discrepancy 

can  be  compared  for  several  predictors.  A variety  of  applications  of 
PSR  can  be  found  in  the  following  papers,  Gelsser  (1974a,  1974b,  1975a, 
1975b),  Stone  (1974a,  1974b).  Here  we  shall  only  present  one  such  very 
simple  application  involving  a data  based  predictor  which  is  to  be 
combined  with  limited  prior  information.  Let  the  predictive  function  be 


f - a h (X)  + (l-a)  g 


0 < a < 1 


(4.5) 


where  g represents  a prior  guess  at  the  value  to  be  predicted  and  h (X) 
the  data  based  predictor.  We  shall  use  the  squared  discrepancy  measure, 
with  a one-at-a-time  omission  schema  so  that 


1 N 

r1  r. 


N>1(o)  - N"1  (chj  + (l-o)  g-Xj): 


(4.6) 


where  hj  is  of  the  form  h,  but  based  on  N-l  observations,  i.e.  Xj 
has  been  omitted.  Maximisation  of  0^  1 (or)  with  respect  to  a yields 

$ m h if  A * 1 

■ g if  0 (4.7) 


where 


■ u h + (l-a)g  otherwise 


£(V«>(*rs>. 


— -1  ^ — w(x 

particular  if  h ■ x then  for  s2  «=  (N-l)  jE^Xj-x)2  and  t2« 


(4.8) 


t2  + (N-l)' 


if  t2  > 1 


otherwise . 


(4.9) 
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Ihia  procedure  haa  Che  property  that  if  the  a ample  mean  ia  within  one 
aample  standard  deviation  of  the  mean  from  the  prior  guess  g one  uses  g 
otherwise  one  uses  the  linear  combination.  Further  as  the  distance  between 
the  aample  mean  and  g increases  relative  to  the  aample  standard  deviation, 
greater  weight  is  attached  to  the  aample  mean.  Moreover  as  N increases  the 
predictor  tends  asymptotically  to  the  sample  mean. 

In  many  applications  it  would  appear  that  observational  omissions  one- 
at-a-time  are  appropriate.  However  there  are  some  applications  where  this 
may  not  be  the  case.  This  point  and  others  involving  various  schemata  of 
omissions  and  choice  of  relevant  partitions  are  discussed  in  Geisser  (1975a). 

There  have  also  been  various  attempts  to  extend  PSR  point  prediction 
to  seta,  intervals,  and  regions.  It  is  not  yet  clear  as  to  how  satisfactory 
any  of  these  methods  are.  Pertinent  references  are  Geisser  (l97*+b)»  Hlnkley 
(I975),  Butler  and  Rothman  (1975)* 

5,  AN  APPLICATION.  We  now  illustrate  how  some  of  the  previous  method- 
ology might  be  applied  in  practice  to  what  may  be  termed  a simple  survival 
situation.  Suppose  we  have  a random  sample  X^,...,XN  on  an  exponential 

random  variable  X whose  density  is 


f(x|u)  - u > 0,  x > 0.  (5.1) 

Further  suppose  our  prior  objective  or  subjective  information  is  subsumed 
in  a prior  density  for  u,i 

p(tt)  “ U6‘VYW,  V > 0,  6 > 0.  (5.D) 

Here  u takes  the  place  of  9 in  the  high  structure  Bayesian  approach  and 
t a (6,y),  Our  Interest  is  in  predicting  a value  for  the  random 

(n) 

future  observation  given  the  previous  N observations  , say. 

Then  the  predictive  density  for  ia  easily  calculated  to  be 

f(xN+llx^)  3 J P(u|x(n))  f(xN+1|u)  du 

( 5 • ' ) 

a (N  + 5)(Nx  + v)NnV(Nx  + Y + z > 0, 

(n  ) 

where  x is  the  sample  mean  and  p(u|x'  ) is  the  posterior  density  o£  u 

(N) 

given  the  previous  h observations  x'  , Hence  our  forecast  about 

involves  the  hype rparame tees  Y and  6 which  enter  the  problem  via  the 
distribution  of  the  parameter  u-  Before  any  observations  are  taken  one  can 
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also  find  Che  predictive  (marginal)  density  of  the  generic  variable  X, 
namely 

f(x)  « J*  f(x|u)p(ix)dn 

« 6v6/(v  + x)6+1,  x > 0. 


(5.fc) 


Hence  it  is  convenient  and  more  appropriate  from  the  predictive  view  to 
think  about  these  hyperparameters  in  terms  of  predicting  X before  any 
observations  are  taken  rather  than  in  how  they  modulate  the  assumed  prior 
distribution  of  y,.  Therefore,  prior  to  the  sample,  we  have 


where 


e(x)  ■ y/(s  - l)  ■ g 

Var(x)  * 5y^/(6  - 2 ) ( 6 - l)2  = g2(l  + a)/(l  - Ot) 

-<6-  I)’1. 


(5-5) 


Clearly  Var(x)  exists  for  0 < a < 1,  and  E(x)  exists  for  u > 0 
while  the  distribution  exists  for  all  [-1,0],  Hence  if  one  could 

frame  his  prior  opinions  about  the  potentially  observable  values  of  X 
in  terms  of  its  expectation  and  variance  then  one  can  easily  execute 
the  whole  predictive  process  by  solving  for  the  appropriate  values  6 
and  v from  (5.5)  and  substituting  them  in  (5>3)> 

It  is  to  be  noted  that  (5.3)  and  (5.1+)  were  obtained  from  (5.1) 
and  (5.2).  However,  for  the  predlctivist  who  would  prefer  to  start  from 
(5.1)  and  (5.!+)  in  terms  of  convenience  of  framing  his  predictions  this 
is  somewhat  awkward.  Interestingly  enough  in  this  c-ise  starting  with 
f(x|u)  and  f(x)  is  sufficient  to  obtain  p(u)  and  f(xN+Jx),  which  is 

a more  logical  and  appealing  approach  for  the  predictivist . This  is 

possible  here  because  f(x)  is  the  unique  Laplace  transform  of  y,  • p(u). 

Now  as  we  mentioned  previously  positing  all  of  these  assumptions 
yields  the  requisite  information  for  making  probability  statements  about 
a future  value  provided  that  one  has  specified  values  for  g and  a.  How- 
ever while  one  may  often  be  willing  to  hazard  a guess  at  g,  one  may  be 
far  less  willing  to  specify  a value  for  a.  So  in  further  analysis  of  this 
problem  we  may  be  in  a position  such  that  some  of  the  parameters  of  t are 
assumed  known  and  others  unknown.  Assume  then  that  g is  known  but  not  a. 

One  approach  for  estimating  a or  5 is  from  the  marginal  density 


f(xx xn|5,y)  =»  J f(x1,...,xN|y)p(y|  5,y)  dy. 


(5.1’) 


r(N+5 


i i 


r(6)  [nx+yj 


359 


Since  we  assume  g ■ y^y  i»  known  we  let  Y^  - g"*^  and  obtain  for 
N 

Ny  ■ S y, 
i-1  x 


«(*! Vn|6) 


f(N+6)  (6-l)6 

r(fi)  [Ny+6-l]N+6 


(5.7) 


N 

Clearly  Y^«S  is  sufficient  for  6 in  the  above  likelihood.  The 
density  of  S is  then  easily  obtained  to  be 


f(s|6) 


(6-1)5  T(N+6)  a**"1 

r(N)  r(a)  (s+6-i)N+5 


(5.8) 


which  Implies  that  «6  — g2  (ofi;  N,  6)  a Beta  distribution  of  the  second 
kind.  The  method  of  moments  essentially  falls  here  to  yield  a sensible 
estimate  e.g.  E (s)  a N,  which  is  uninformative  relative  to  6 or  a. 
Use  of  higher  moments  tends  to  restrict  the  range  of  $ and  renders  it 
unreasonable  as  an  estimator.  The  reason  that  moment  estimators  are 
basically  Inappropriate  here  is  that  they  assume  the  existence  of  the 
moments  used  and  hence  tend  to  presume  a restriction  on  the  range  of  6, 
whose  restriction  on  the  outset  is  5 > 1.  One  can  use  however  maximum 
likelihood  estimation.  Hence  we  calculate 
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1 
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+ « * •+ 


1 _ N+5 

N-l+6  s+6-1 


(5.9) 


and  one  would  have  to  find  by  one  means  or  another  6 satisfying  » 0. 

An  explicit  solution  for  8 seems  impossible  to  achieve.  One  can  approximate 
(5.9)  hy  using  the  Euler-Maclauren  sum  formula  so  that  we  obtain  for  large  N 


dlogf  . 6 
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+ log' 


N+5 

s+6-1 


N+5  1 

s+n  25  " 2 ( 6+n) 


(5.10) 


This  is  still  quite  formidable  and  when  set  equal  to  zero  still  does  not  yield 
an  explicit  solution  for  6. 


We  now  show  how  PSR  may  be  of  service  even  in  this  high  structure 
situation.  Suppose  we  were  to  predict  a single  value  Xj^  from  (5.3) 

using  the  predictive  mean 

E(XN+i|x  ax)  a (cNx  + g)/(u  N+l).  (5.11) 

Apply  the  PSR  method  for  the  estimation  of  a using  ( 5 . 11 ) as  a 
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predictive  function  and  squared  discrepancy  with  one-at-a-time  omission 
schema  so  that 


i N a(N-l)x.+g 

DN>l(«)  a N u(N-l)+l  " Xj^ 


where  x,  is  the  mean  of  the  observation  with  x^  omitted.  Minimization 


w lie  ae  a . 4 

Of  DNi1W 


with  respect  to  u yields 


1“  ” 


for  t2  > 1 


u - 0 


for  tp  is  1 


where  t2  » N(g-x)2/s2  and  s;;  ■ N_1  £ (x. -x)2.  Hence  PSR  may  be  used 

i-1  1 

to  generate  estimates  even  in  the  high  structure  case.  On  the  other  hand 
using  (5.11)  and  (5. 12)  as  a predictive  function  and  discrepancy  measure 
respectively  yields  a PSR  predictor 

’Wi  “ (o  Nx  + g)/(o  N+l)  (5- 14 


that  does  not  strictly  depend  on  high  structure  assumptions.  In  fact  it 
may  be  robust  for  a variety  of  high  structure  assumptions  which  result  in 
a predictive  expectation  approximately  equal  to  (5. 11).  Actually  if  one 
did  not  use  any  high  structure  hint  for  a predictive  function  for  this 
problem  but  merely  used  a convex  combination  of  sample  mean  and  prior  guess 


XN+1  = “*  x + 


0 £ a*  £ 1, 


then  the  result  for  0*  was  already  obtained  in  section  4 as 


(t2-l)  x + 

ea  (N-l)-1 


if  t2  > 1 


if  t'‘  <-  1 


This  may  be  contrasted  with  (5*44)  when  the  value  for  a is  inserted 
which  turns  out  to  be 


XN+1  “ 


it  -1)  x •-  £ 


t;;  > 1 


t'J  < 1. 


(5.17) 
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The  predictor  in  (5.I7)  is  weighted  slightly  more  towards  x than 

(5.16),  but  in  fact  they  are  asymptotically  equivalent  to  order  N 
In  any  practical  example  there  would  probably  not  be  much  to  choose 
between  them. 

It  is  also  to  be  noted  that  the  intermediate  structures  are  difficult 
or  impossible  to  apply  in  situations  such  as  this  one  where  there  may  be 
some  prior  information  that  should  be  taken  into  account. 

6.  REMARKS.  A somewhat  abbreviated  exposition  of  the  predlctivistic 
view  has  been  presented.  This  view  is  not  a mode  o':  inference  as  such  but 
can  be  Implemented  from  a variety  of  inferential  modes.  It  stems  from  the 
attitude  that  inferences  should  be  restricted  to  potentially  observable 
entities  unless  compelling  reasons  to  contrary  exist.  In  conformance  with 
this  view  we  have  presented  various  ways,  arising  from  different  stand- 
points, of  implementing  the  predictive  approach.  In  particular  a recently 
developed  low  structure  approach  PSR  has  also  been  delineated  in  some- 
what greater  detail,  which  should  be  of  great  value  in  many  situations  and 
need  be  added,  we  believe,  to  the  toolkit  of  every  statistician. 
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Arlington,  Virginia 

When  confronted  with  the  prospect  of  drawing  order  out  of  complex 
human  behavior  in  the  equally  complex  world  of  work,  two  primary  charac- 
teristics have  marked  traditional  behavioral  science  research.  First, 
heavy  reliance  has  been  placed  upon  human  evaluations  or  ratings  of 
other  humans.  Secondly,  these  performance  or  trait  ratings  have  been 
predominantly  gathered  from  a limited  observational  viewpoint,  namely 
the  supervisor.  The  technique  outlined  in  the  present  paper  does  not 
deviate  from  the  first  of  these  characteristics;  it  does  rely  on  human 
evaluation  of  other  humans.  However,  it  goes  beyond  the  second  charac- 
eristic  by  gathering  such  evaluative  information  from  the  additional 
perspective  of  an  individual’s  peers.  For  purposes  of  the  present 
paper,  peers  are  operationally  defined  by  their  sharing  of  some  common 
purpose  (e.g.,  members  of  the  same  work  group),  and  generally  by  the 
lack  of  a formally  recognized  authority  relationship  between  them.  The 
term  associate  will  be  used  interchangeably  with  peer. 

The  history  of  peer  evaluations  can  be  traced  back  to  post  World 
War  II  work  by  Williams  and  Leavitt  (1947). ^ The  history  of  the  techni- 
que  can  be  traced  back  even  further  to  the  original  work  of  Moreno  (1934) 
and  his  development  of  the  sociogr.im  technique.  Since  that  time,  peer 
evaluations  have  been  used  for  two  primary  purposes.  The  first  of 
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World  War  II.  See,  for  example,  Clarke  ( 19 4 '»)  , 
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these  purposes  Is  evaluative  in  the  criterion  sense  (i.e.,  leadership 
effectiveness,  job  performance,  etc.).  The  second  purpose  is  evaluative 
in  the  sense  of  predicting  future  behavior  or  success  (i.e,,  motivation 
to  work,  goal  orientation,  potential,  etc.).  Llnd2ey  and  Byrne  (1968) 
have  presented  an  excellent  review  of  the  use  of  social  choice  method- 
ology of  which  peer  evaluations  are  one  type.  More  specialized  reviews 
of  the  work  are:  Gibb  (1961),  Gibb  (1969),  Hollander  (1954),  Boulger 

and  Colmen  (Note  3) , and  Nadal  (Note  4) . 

Aside  from  considerations  about  the  use  of  peer  evaluations, 
another  major  issue  centers  on  what  the  dimension  Is  which  peers  are 
evaluating.  For  instance,  previous  research  has  been  directed  at  peer 
evaluations  of  leadership  (Hollander,  1965),  personality  traits  (Tupes 
and  Chrlstal,  Note  5),  and  supervioor  skills  (Veitz,  195G)  to  name  but  a 
few  of  the  dimensions  which  have  been  investigated.  While  we  will  not 
directly  address  the  issue  of  which  dimension  is  measured,  it  is 
probably  the  single  most  important  decision  the  researcher  makes  in 
the  design  of  the  experiment. 

Given  this  short  background  we  will  address  two  major  areas  which 
relate  to  the  development  of  a peer  evaluation  system;  first,  method- 
ological considerations  and  second,  situational  factors  which  could 


impact  upon  the  evaluative  process. 

To  t..cilitate  understanding  of  the  methodological  issues,  they  will 
be  described  in  terms  of  effects  upon  the  major  scaling  techniques 
available,  of  which  there  are  four:  ratings,  rankings,  full  nominations 
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and  high  nominations.  A summary  of  the  following  discussion  is  provided 
in  Pigdre  1. 

Methodological  Issues 

The  general  paradigm  of  the  rating  technique  calls  for  a group 
member  to  provide  a rating  of  the  relative  amount  or  degree  of  the 
dimension  under  consideration  possessed  by  every  other  group  member. 

The  ranking  procedure  simply  requires  each  group  member  to  rank  order 
every  other  group  member  from  high  to  low  (or  some  other  relevant 
continuum)  on  the  dimension  under  consideration.  The  full  nomination 
technique  requires  that  each  group  member  choose  a specified  number  or 
proportion  of  the  group  as  being  either  high,  medium,  or  low  on 
the  dimension.  In  the  present  paper,  the  minor  variation  of  this 
technique  whenever  middle  or  medium  nominations  are  not  required 
will  also  be  referred  tc  as  full  nominations.  However,  the  case  where 
only  high  nominations  are  elicited  is  reserved  as  a discriminably  different 
technique  for  reasons  to  be  elaborated  in  later  portions  of  the  paper. 
Several  variations  based  on  combinations  of  these  basic  techniques  are 
forced  distribution  rankings  or  combinations  of  rankings  and  ratings  or 
nominations.  General  scoring  algorithms  for  the  four  primary  techniques 
are  presented  below: 

Ratings 

Score  “ ^rRt 
N 


Rankings 


Score  * ^ E^Rkj  x ^ ^ j 
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Full  Nominations 


Score  ■ S rH 
N 

where  rRt  ■ Rating 
rRk  " Ranking 

rL  ■ Low  nomination 

rjj  * Mid  (or  no)  nomination 

rj|  ■ High  nomination 

N ■ Number  giving  an  evaluation 

Nip  ■ Total  number  in  the  group 

By  inspection,  several  characteristics  of  these  formulae  should 
be  noted.  All  of  these  techniques  produce  scores  which  are,  in 
general,  independent  of  group  size  wiLh  the  exception  of  the  rank- 
ing formula  in  which  case  adjustment  must  be  made  for  group  sizes 
greater  than  100.  It  can  also  be  seen  that  the  average  score  for  a 
group  using  either  a ranking  or  nomination  technique  is  determined; 
the  average  score  for  the  rating  technique  is  free  to  vary. 

Metric  and  Distribution 

The  metric  and  distributional  properties  of  associate  evalua- 
tions are  directly  related  to  the  particular  tochniqi . employed. 

With  respect  to  the  scaling  properties  of  the  various  techniques,  the 
rankings  an.'  Loth  nominations  from  an  evaluator  are  ordinal  data 
(Stevens,  1951).  The  ratings  from  an  evaluator  are  the  most  nearly 
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interval  data  although  here  alao  it  can  be  argued  that  these  are 
merely  ordinal  data.  The  scaling  properties  of  the  summated  scores 
from  the  various  techniques  approximate  Interval  data  as  the  number 
in  the  evaluation  group  Increases. 

In  addition,  the  4 most  common  procedures  will  commonly  produce 
different  distributions,  examples  of  which  are  displayed  In  Figure  2. 
Given  the  free  response  mode  for  ratings,  they  will  often  produce 
negatively  skewed  distributions  due  largely  to  group  norms  to  inflate 
any  evaluative  procedure.  The  ranking  procedure,  if  it  were  perfect- 
ly reliable,  would  produce  a rectangular  distribution  with  one  person 
at  each  rank.  Generally,  less  reliable  rank  scores  will  tend  to  be 
normally  distributed  with  even  less  reliable  scores  producing  a more 
leptokurtic  curve,  and  a perfectly  unreliable  test  producing  a point 
distribution  with  everyone  receiving  an  average  rank  equal  to  the 
middle  rank.  Full  nomination  scores  produce  a distribution  which, 
if  perfectly  reliable,  is  tri-modal  with  one  group  receiving  all 
high  nominations,  a group  with  all  low  nominations  and  the  remainder 
having  middle  nominations  or  none  at  all.  High  nominations  only  pro- 
duce a bi-modnl  distribution  (not  shown  in  Figure  2). 

Basis  of  Comparison 

Scores  which  result  from  the  four  primary  techniques  vary  along 
another  important  dimension;  that  1b,  the  internal  process  evoked 
in  the  evaluator  upon  which  he  makes  his  judgement.  In  one  case,  the 
evaluator  compares  the  particular  individual  against  some  external 
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(to  the  group)  frame  of  reference  and  aaaigna  him  to  aome  category. 

In  the  second  case,  the  evaluator  compares  the  particular  indivi- 
dual against  some  Internal  (to  the  group)  frame  of  reference  and 
makes  a judgement  of  more  or  less  and  assigns  him  to  the  appropriate 
category.  The  external  process  can  only  be  used  with  the  rating 
procedure.  The  Internal  process  can  be  used  with  the  ratings,  but 
it  must  be  used  with  rankings  and  nominations.  It  should  be  noted 
that  the  internal  process,  in  general,  requires  a moderate  number  of 
individuals  in  the  group  (more  than  5).  The  direct  implication  of 
this  distinction  is  that  the  external  frame  of  reference  allows  both 
comparison  between  individuals  across  peer  groups  and  the  comparison 
of  peer  groups.  The  Internal  process  does  not  allow  comparison 
between  individuals  across  peer  groups  unless  the  assumption  is 
accepted  that  the  groups  are  equal  on  the  particular  ability,  trait 
or  behavior. 

The  corollary  of  this  implication  is  that  population  norms 
can  be  developed  only  through  the  use  of  a rating  procedure  and  an 
external  frame  of  reference. 

Reliability 

The  reliability  of  associate  evaluations  has  generally  been 
determined  by  one  of  two  methods.  Internal  consistency  or  test-rctest. 
Both  methods  are  analogous  to  the  same  procedures  in  classical  test 
theory  (Lord  and  Novick,  1968). 

The  Internal  consistency  of  peer  evaluations  is  the  degree  to 


368 


I 


which  members  of  « peer  group  agree  with  one  another  when  observing 
an  Individual  in  a similar  situation  and  At  the  same  time.  Using  the 
multiple  choice  test  paradigm,  the  evaluators  are  comparable  to  the 
teat  items  and  those  who  are  being  evaluated  are  comparable  to  the 
people  taking  the  test.  While  Gordon  (1969)  has  recommended  the  use 
of  the  alpha  coefficient  for  estimating  the  Internal  consistency  or 
reliability  of  peer  evaluations,  the  most  common  procedure  has  been 
a split-half  (or  group)  estimate.  The  split-half  estimate  is  made 
by  computing  scores  for  all  group  members,  randomly  assigning  peer 
group  members  to  one  of  two  groups,  and  then  correlating  the  scores  for 
each  ratee  from  each  group  (See  Hollander,  1957;  and  Downey,  Note  6). 
The  correlation  is  then  adjusted  for  the  total  group  size  ur.ing  the 
Spearman-Brown  formula  (Gulllksen,  1950).  If  small  groupB  are  used, 
a random  split  may  not  be  possible  and  some  technique  for  averaging 
the  lntercorrelatlona  between  evaluators  could  be  used  (Gulllksen, 
1950). 


The  test-retest  method  of  estimating  reliability  requires  that 
group  members  evaluate  each  other  at  two  different  times.  Scores 
from  the  two  different  evaluations  are  then  correlated.  Examples  of 
thLs  type  of  estimate  are  given  in  Hollander  (1957),  Downey  (Note  6), 
and  Downey  (Note  7).  Perhaps  the  most  rigorous  examination  of  relia- 
bility was  done  by  Gordon  and  Mcdland  (1965)  where  they  varied  both 
time  and  group  doing  the  evaluations  and  found  reliabilities  in  the 
80's. 
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Research  has  generally  found  Che  reliability  of  peer  evaluations 
to  be  in  the  .70  to  .90  range,  regardless  of  the  reliability  estimate 
employed.  Research  which  haB  compared  the  various  evaluative  method- 
ologies is  rare,  but,  in  general,  has  supported  the  view  that  all  four 
methods  are  quite  similar  with  maybe  a slight  advantage  to  ratings 
(See  Suci,  Vallance,  and  Glickman,  Note  8;  Downey,  Note  6;  and  Hammer, 
Note  9) . Even  the  use  of  a paired  comparison  procedure  does  not 
significantly  improve  reliability  (Bolton,  Note  10).  The  selection  of 
a particular  technique  will  rarely  be  decided  by  differences  in 
reliability  between  the  techniques. 

Acceptability 

A major  factor  in  the  success  or  failure  of  a particular  research 
program  is  the  degree  of  involvement  and  commitment  to  the  program 
on  the  part  of  the  participants,  in, other  words,  acceptability. 
Acceptability  is  generally  studied  as  a specific  issue  of  the  particu- 
lar program  under  investigation  rather  than  comparative  analyses  of 
acceptability  across  techniques  or  situations.  There  is,  therefore, 
little  formal  evidence  of  differences  between  techniques  but  many 
inferences  can  be  drawn  based  upon  the  particular  qualities  of  the 
technique.  A major  factor  in  the  acceptability  of  a technique  is  the 
degree  of  perceived  difficulty.  From  this  point  of  view,  both  the 
rating  and  ranking  of  large  numbers  of  people  (greater  than  20)  can 
be  time  consuming  and  makes  for  difficult  discriminations  among  the 
overage  members  of  the  group.  On  the  other  hand,  the  nomination 
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procedure  allows  the  individual  to  place  a large  number  of  people 
in  a desired  category  and  does  not  force  him  to  make  difficult  discrimi- 
nations. 


The  rating  procedure  is  quite  acceptable  where  the  group  is  small 
and  cohesive.  The  full  nomination  technique  is  acceptable  for  moder- 
ate to  large  sise  groups  where  not  all  individuals  are  well  known  to 

i 

one  another.  The  high  nomination  technique  is  even  more  acceptable 
1 because  it  does  not  require  an  individual  to  make  negative  evaluations. 

i 

A major  determinant  of  the  degree  of  acceptability  is  the  degree 
to  which  group  members  are  knowledgeable  about  the  evaluation  procedure , 

| process,  background  and  use.  Downey  (Note  11)  found  that  accept- 

i ability  improved  as  a function  of  an  educational  program.  Two  differ- 

ent types  of  attitudes  were  found;  first,  the  degree  to  which  peer 
evaluations  were  felt  to  be  valuable  and  accurate  estimates  and, 
second,  the  degree  to  which  they  were  acceptable  for  particular  uses. 

Downey  also  found  that  the  peer  evaluations  and  acceptance  were 
positively  related,  with  larger  relationships  being  found  in  the  group 
with  less  information  on  the  peer  evaluation  process. 

Feasibility 

Closely  linked  with  the  previous  concert  of  acceptability  is 
feasibility,  or  costs  associated  with  the  implementation  and  execution 
of  a particular  peer  evaluation  system.  The  major  costs  associated 
with  a peer  evaluation  system  are:  (1)  preparation  of  evaluation 

materials,  (2)  administration  time,  and  (3)  scoring  cost.  Prior  to 
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the  advent  of  automatic  data  processing  procedures,  the  costs  associ- 
ated with  any  peer  evaluation  system  with  large  groups  or  on  a large 
scale  were  prohibitive.  Merely  in  terms  of  bits  of  information 
collected,  it  can  be  seen  that  the  number  of  evaluations  is  equal  to 
N2  where  N 1b  the  number  in  the  group.  Figure  3 presents  the  compara- 
tive costa  associated  with  each  of  the  four  techniques. 

As  can  be  seen  from  Figure  3,  each  of  the  4 techniques  incur 
equally  high  costs  associated  with  the  preparation  of  a list  of  the 
peers.  It  is  important  that  all  evaluators  be  provided  with  a full 
list  of  all  other  members  of  the  peer  group.  The  administration  time 
for  the  full  nomination  technique  is  low  due  to  the  small  number  of 
decisions  associated  with  making  the  low  and/or  high  choices.  An 
excessive  amount  of  time  is  devoted  to  making  fine  discriminations 
in  the  ranking  procedure,  whereas  a moderate  amount  of  time  is  taken 
up  by  the  rating  of  every  individual. 

The  scoring  of  the  peer  evaluations  normally  requires  access  to 
some  sort  of  automatic  data  processing  facility  in  all  but  the  small- 
est scale  operations.  The  actual  computer  cost  is  virtually  equal 
for  all  techniques,  but  they  can  differ  substantially  in  terms  of  the 
costs  associated  with  getting  the  evaluations  into  a data  processable 
form.  Costs  vary  by  technique  as  u function  of  using  either  keypunch- 
ing or  optical  scanning,  Both  the  full  and  high  nomination  techniques 
involve  low  cost  and  ratings  also  have  low  costs  associated  with 


optical  scorning.  Rankings  produce  high  costs  in  both  keypunching 


and  optical  acanning  and  ratings  have  high  costa  associated  with 
keypunching.  Generally,  nominations  produce  the  lowest  costs  overall 
followed  by  ratings  with  rankings  having  the  highest  costs  overall. 

It  should  be  noted  that  peer  evaluation  systems  are  relatively  costly 
efforts  which  typically  require  more  than  minimal  sophistication 
with  data  processing  procedures. 

Situational  Factors 

In  addition  to  the  methodological  concerns  of  the  various  techni- 
ques presented  in  the  previous  section,  there  are  also  a variety  of 
situational  or  contextual  factors  which  can  impact  upon  a peer  evalua- 
tion system,  often  regardless  of  the  specific  technique  under  discus- 
sion. Among  these  factors  are  group  else,  informal  group  structures, 
demographic  characteristics,  group  boundaries,  hierarchical  character- 
istics, friendships,  length  of  association  and  type  of  interaction. 
Each  of  these  factors  will  be  discussed  in  turn  and,  where  appropriate, 
specific  mention  will  be  made  of  their  effect  upon  the  various 
techniques. 

Site 

Very  few  attempts  have  been  made  to  study  the  independent  effects 
of  group  size.  More  often  than  not,  what  evidence  there  is  for  the 
effects  of  group  size  has  been  reported  as  a byproduct  in  studies 
directed  at  some  other  purpose.  For  example,  Downey,  Medland,  and 
Yates  (Note  12),  used  a peer  nomination  technique  with  groups  of 
Army  Colonels  in  14  career  groups  which  varied  in  size  from  22  to  321. 
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Reliabilities  varied  from  .63  Co  .94  and  the  rank  order  correlation 
between  group  size  and  reliability  was  .03.  Downey  (Note  7),  in 
a sample  of  Army  Rangers>  compared  peer  ratings  collected  within 
squads  (n  • 10)  with  peer  nominations  collected  on  the  same  men 
within  platoons  (n  “■  40).  Correlations  between  the  two  scores  were 
in  the  ,60's.  However,  there  were  indications  that  the  platoon 
Bcorcs  were  both  more  reliable  and  more  predictive  of  job  performance. 

As  mentioned  previously,  from  the  standpoint  of  feasibility, 
both  ratings  and  rankings  would  seem  to  be  most  appropriate  for 
relatively  small  group  sizes  (i.e.,  approximately  a dozen),  while 
the  nomination  technique  is  virtually  mandatory  for  large  group 
sizes  (i.e.,  greater  than  50).  From  the  standpoint  of  empirical 
results,  it  appears  that  small  groups  ay  produce  unreliable  scores 
with  reduced  validity.  Alternatively  , while  t is  rational  to  believe 
that  there  is  an  optimal  upper  size  peer  group,  there  is  scant 
evidence  to  support  this  view. 

Informal  Croup  Structures 

Given  the  well  documented  fact  that  within  any  formally  defined 
group  there  may  exist  one  or  more  Informal  subgroups  defined  by  some 
sort  of  mutual  self  interest,  the  Issue  arises  as  to  what  effect  these 
informal  subgroups  may  have  on  a peer  evaluation  procedure  conducted 
in  the  total  group  for  a purpose  other  than  finding  what  subgroup 
Structure  exists. 

For  example,  the  worst  case  would  be  one  in  which  two  equaL- 
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sized  informal  subgroups  existed  within  a total  group  and  included 

* 

each  group  member  exclusively  in  one  or  Che  ocher.  In  such  a situa- 
tion, ic  can  be  assumed  that  one  or  both  subgroups  might  make  their 
evaluations  solely  on  the  basis  of  subgroup  membership,  i.e.,  on  a 
basis  ocher  than  the  one  intended.  The  net  effect  of  such  behavior 
is  to  attenuate  Che  validity  of  the  peer  evaluation  procedure,  and 
it  is  most  pronounced  when  both  subgroups  engage  in  such  behavior. 

The  effect  diminishes  if  one  of  the  groups  does,  in  fact,  provide 
evaluations  on  the  dimension  intended.  The  effect  also  diminishes 

1 

as  informal  subgroup  size  decreases  or  as  the  number  of  subgroups 
Increases, 

In  terms  of  technique,  the  effect  of  subgroup  behavior  will  be 
pronounced  if  ratings  or  rankings  are  used  with  resultant  scores 
most  likely  to  be  negatively  skewed.  The  use  of  full  nominations 
will  tend  to  produce  scores  with  decreased  variance,  and  high  nomina- 
tions will  produce  the  worst  case  with  a drastic  reduction  in  variance. 

It  is  clear  that  subgroups  of  sufficient  size  can  have  an  effect 
upon  the  final  scores,  and  therefore  the  question  is  the  incidence 
of  such  effects  and  whether  there  exists  a mechanism  for  detecting 
its  occurrence.  The  simplest  procedure  for  checking  for  these  problems 
is  the  repetitive  production  of  reliability  indices  as  part  of  the 
procedure  for  producing  peer  scores.  If  the  reliability  coefficients 
were  to  drop  below  ,60,  It  would  seem  to  Indicate  a problem  and  care 
should  be  taken  iu  use  of  the  evaluations,  If  the  evaluation  process 
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is  part  of  an  ongoing  process,  then  the  use  of  a two-way  analysis 
of  variance  design  with  one  factor  being  the  types  of  raters  and 
the  other  factor  being  the  same  types  of  ratees  should  be  used. 

If  a significant  interaction  were  found,  then  a strong  case  could 
be  made  for  peer  scores  being  at  least  partially  the  result  of  group 
membership. 

Demographic  Characteristics 


The  use  of  peer  evaluations  with  their  reliance  upon  fallible 
human  observers  immediately  raises  the  possibility  of  racial  and  sexual 
bias  on  the  part  of  evaluators.  This  concern  is  especially  crucial 
in  view  of  recent  problems  associated  with  demonstrating  the  absence 
of  bias  in  employment  selection  and  classification  measures  as  well 
as  criterion  measures. 

The  evidence  concerning  racial  bias  in  peer  evaluations  is  mixed 
and  inconclusive.  In  a study  dealing  with  Air  Force  recruits,  Cox 
and  Krumboltz  (1958)  found  that  subjects  were  rated  higher  by  members 
of  their  own  race,  but  the  effect  varied  across  groups  and  there 
was  substantial  agreement  on  rank  order  across  races  (r  -*  .76). 

They  conclude  that  t he  bias  which  might  exist  is  far  from  complete 
and  suggest  that  prior  acquaintanceship  of  group  members  may  account 
for  the  differences.  In  a similar  study  in  the  Army,  dcJuug  and 
Kaplan  (19G2)  found  similar  results  with  ratings  differing  as  a 
function  of  the  rater's  race.  However,  an  analysis  of  covariance 
adjusting  for  a combined  interest  and  math  score  showed  that  whit  os 
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did  not  give  higher  adjusted  scores  to  whites  or  blacks,  but  blacks 
did  give  higher  adjusted  scores  to  blacks.  Results  were  interpreted 
in  terms  of  assignment  of  higher  scores  to  close  acquaintances  which 
had  more  of  an  impact  upon  blacks  rating  blacks  due  to  the  smaller 
group  size. 

In  a more  recent  study  in  an  industrial  training  context,  Schmidt 
and  Johnson  (1971)  used  a forced  choice  rating  distribution  in  groups 
with  approximately  equal  numbers  of  blacks  and  whites.  No  differences 
due  to  race  were  found, 

The  evidence  suggests  that  peer  evaluations  can  be  subject  to 
racial  bias,  but  the  effect  is  perhaps  more  strongly  related  to  the 
interaction  between  friendship  or  acquaintanceship  and  the  particular 
evaluation  method  used.  The  presence  of  substantial  correlations 
between  the  rank  orderings  from  each  race  indicates  that  a similar 
view  prevailed.  Rut,  the  use  of  ratings  allows  evaluators  to  assign 
unrelated  scores  to  individuals  whom  they  consider  special,  in  some 


In  terms  of  sexual  bias,  Mohr  and  bovuev  (Note  HI  recent 'y 
reported  results  from  a small  sample  of  Army  officers  which  Indicate' 
that  females  scored  lower  than  males  on  scores  received  from  both 
males  and  females.  If  bias  occurred.  It  was  on  t he  part  of  both 
groups.  An  interesting,  finding  was  that  females'  self-ratings  were 
not  related  to  either  male  or  femal-  evalu.it  ions  but  males'  self- 


ratings were  related  to  these  evaluations. 


This  admittedly  small  numLer  of  studies  appears  to  indicate  that 
differences  based  upon  race  and  sex  can  occur,  but  it  is  unclear  whether 
these  differences  are  attributable  to  race  or  sex  group  differences, 
to  Interaction  patterns  (i.e.,  friendships,  etc.),  to  the  specific 
methodology,  or  some  combination  of  all  of  these  factors.  It  would 
certainly  be  safe  to  say  that  researchers  should  be  sensitive  to  the 
potential  for  such  bias. 

Group  Boundaries 

The  discussion  of  peer  evaluations  has  proceeded  to  this  point 
as  if  it  were  clear  just  what  is  meant  by  a peer  or  associate  group. 

Most  researchers  report  their  procedures  in  sufficient  detail  to  show 
the  general  characteristics  of  the  groups  which  were,  in  fact,  used. 
However,  given  that  there  are  a variety  of  overlapping  and  higher 
order  groups  in  most  real-life  settings,  the  issue  becomes  that  of 
defining  some  basic  guidelines  for  selecting  the  appropriate  rating 
group.  It  is  clear  that  the  selection  of  the  evaluative  group  can  be 
effected  by  such  factors  as  the  length  and  type  of  interaction, 
formal  organizational  structure,  informal  group  structure,  friendship 
patterns  and,  of  course,  the  particular  dimension  being  evaluated. 

As  has  been  the  case  for  several  of  the  preceding  issues,  there 
is  little  empirical  data  to  guide  the  selection  of  the  group.  Rather, 
guidelines  must  be  best  guesses  based  on  partial  information  from 
related  data. 

In  the  previously  mentioned  study  by  Downey  (Note  7),  it  was 


found  that  platoon  evaluations  produr  d more  reliable  and  slightly 
more  valid  scores  than  squad  evaluations)  but  the  differences  were 
potentially  confounded  by  differences  between  both  method  and  size. 

A study  by  Gordon  and  Medland  (1965),  in  which  individuals  were 
evaluated  at  two  different  times  by  totally  different  groups  of 
different  structure,  indicated  a high  degree  of  stability  across  the 
two  evaluations.  Even  the  method  which  was  used  to  compute  reliability 
indices,  random  splits  of  the  primary  group,  supports  the  notion  that 
group  composition  can  be  drastically  altered  without  major  problems 
arising  in  producing  reliable  and  valid  scores. 

A concept  related  to  that  of  group  boundaries  is  chat  of  hierarch- 
ies. For  example,  an  Army  platoon  is  made  up  of  4 squads,  each  headed 
by  a squad  leader.  If  the  platoon  is  chosen  as  the  peer  group,  the 
issue  is  whether  the  squad  leaders  should  be  included  in  the  process 
Folklore  holds  that  the  inclusion  of  such  individuals  will  often  work 
to  their  disadvantage,  and  therefore  they  should  be  excluded  from  the 
platoon  peer  group  and  included  in  a peer  group  of  squad  lenders. 

Research  by  Levi,  Torrance,  and  Pletts  (1958)  indicated  no  effects 
from  including  the  formal  leader  in  the  peer  evaluation  process. 
Research  by  Downey  (Note  14),  in  which  the  leaders  of  small  combat 
units  were  included  in  the  peer  nomination  process,  indicated  that 
the  leaders  spanned  the  full  range  of  leadership  potential  scores. 

And,  rather  than  being  penalized,  there  was  a positive  relationship 


between  formal  position  and  peer  evaluation  scores  (as  there  should 
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be  If  the  selection  procedure  for  lenders  had  any  validity  originally). 

It  should  be  pointed  out  that  these  data  were  experimental  and 
the  Introduction  of  an  operational  system  may  change  the  situation 
depending  upon  the  use  to  which  the  resulting  evaluations  will  be  put. 

A rational  solution  to  the  problem  should  be  guided  by  the 
following  suggestions: 

(1)  Select  the  group  to  have  sufficient  size  to  overcome  problems 
associated  with  primary  groups. 

(2)  Group  size  should  not  be  so  large  as  to  produce  subgroups 
which  may  be  relatively  unknown  to  each  other  or  be  competing  for 
similar  resources  and  rewards. 

(3)  Groups  selected  should  be  somehow  reasonably  related  to  the 
dimension  to  be  evaluated,  a.g.,  if  evaluation  of  leadership  in  a work 
setting  is  desired,  select  a work  group  and  not  a social  group. 
Friendship 

Friendship  has  been  a major  research  issue  in  the  history  of 
peer  evaluations.  This  is  another  case  where  folklore  has  stated 
that  peer  evaluations  are  the  product  of  friendship  or  popularity  and 
are  therefore  not  valid  indications  of  the  dimension  under  considera- 
tion. The  impact  of  this  bit  of  folklore  has  been  that,  with  the 
exception  of  simple  validity  studies,  this  is  probably  the  single 
most  researched  question  associated  with  peer  evaluations. 

Wherry  and  Fryer  (1949)  were  the  first  to  address  the  issue. 

They  reported  that  although  there  was  a moderate  degree  of  relation- 
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•hip  between  friendship  and  a leadership  criterion,  the  major  portion 
of  the  predicted  criterion  variance  was  independent  of  friendship. 

They  concluded  that  peer  evaluations  of  leadership  are  not  popularity 
contests.  Studies  by  Gibb  (1950)  and  Hor rocks  and  Wear  (1953)  in 
college  samples  support  Wherry  and  Fryer's  findings.  Borgatta  (1954) 
also  reported  that  leadership  and  popularity  evaluations  were  related, 
but  he  failed  to  draw  any  conclusions.  Several  other  studies  have 
documented  a moderate  degree  of  relationship  between  friendship  and 
peer  evaluations  of  leadership  Hollander,  1956;  Hollander  and  Webb, 
1955;  Theodorson,  1957). 

Downey  (Note  6)  recently  presented  evidence  that  the  use  of  full 
nominations  (with  small  numbers  of  high  and  low  nominations  required) 
reduced  the  correlation  between  friendship  and  leadership  evaluations 
compared  with  forced  distribution  ratings. 

It  would  seem  that  when  an  evaluator  is  faced  with  a choice  of 
how  to  evaluace  a friend,  he  will  tend  to  select  a friend  rather  than 
another  person  he  considers  of  equal,  or  at  least  indistinguishable, 
merit.  Therefore,  the  variance  associated  with  friendship  may  be  a 
source  of  systematic  error  primarily  in  the  middle  of  the  distribution. 
This  systematic  error  variance  will  Increase  in  large  groups  where 
some  members  are  relatively  unknown  to  each  other  or  the  interaction 
patterns  are  not  fully  established  for  all  members. 

Even  in  the  face  of  the  impressive  research  findings  demonstrating 
the  invalidity  of  the  "popularity  contest"  issue,  this  remains  as  the 
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moot  consistent  argument  agsinst  the  use  of  peer  evaluations  in  an 
operational  setting.  A corrollary  of  this  objection  ie  the  feeling 
that  peer  evaluators  do  not  stake  the  right  choice , the  best  counter- 
argument to  which  is  the  impressive  list  of  validity  studies  on  peer 
evaluations. 
length  of  Association 

When  peer  evaluations  are  considered  for  use  in  any  situation, 
an  Important  question  concerns  how  long  group  members  must  have  been 
in  contact  with  each  other  before  reliable  and  valid  evaluations  can 
be  provided.  For  example,  this  issue  is  often  raised  in  the  context 
of  transient  training  groups.  Research  is  fairly  consistent  in  find- 
ing that  peers  can  make  reliable  and  valid  evaluations  after  a relatively 
short  period  of  time  (typically  3 to  6 weeks) . 

Subsidary  to  the  overall  issue  ia  the  question  of  the  effect  of 
including  a new  group  member  in  an  intact  group.  Mayfield  (Note  14) 
has  suggested  that  in  such  a situation  there  may  be  reason  to  suspect 
that  a longer  period  of  acquaintanceship  is  necessary  for  sufficient 
integration  into  the  group  to  occur.  A more  generalized  way  of 
approaching  the  question  is  the  extent  to  which  a person  is  known  or 
not  known  to  other  members  of  the  group.  Evidence  has  shown  that 
an  individual  not  well  known  to  other  members  of  the  group  will 
typically  be  evaluated  as  lying  near  the  middle  of  the  distribution 
within  the  group  (Downey,  Note  6) . 

In  terms  of  technique,  a nomination  procedure  is  most  likely  to 
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decrease  the  error  variance  associated  with  acquaintanceship  while 
ratings  or  rankings  would  tend  to  capitalize  on  the  error  variance 
and  show  a greater  degree  of  relationship  with  such  measures. 

Type  of  Interaction 

While  the  use  of  peer  evaluations  has  been  extensive  over  a span 
of  more  than  twenty-five  years,  they  have  nevertheless  been  applied  in 
rather  limited  situations.  In  fact,  tho  majority  of  the  research  has 
been  conducted  with  junior  personnel  in  a military  training  context. 
Recent  work  outside  the  military  by  Weitz  (1958)  and  subsequent  follow- 
ups by  Mayfield  (1970;  Note  15)  has  been  conducted  in  industry  with 
insurance  salesmen.  There  has  also  been  a recent  effort  to  use  a 
peer  nomination  process  in  a senior  Army  officer  promotion  system 
which  produced  supportive  results  (Downey,  Medland,  and  Yates,  Note  12). 
But,  until  more  extensive  research  is  conducted  in  broader  organiza- 
tional contexts  with  a wider  selection  oi  subject  populations,  the 
generality  of  the  peer  evaluation  process  is  largely  a matter  of 
conjecture, 

A related  issue  is  tho  type  of  interaction  required  to  produce 
valid  evaluation.  Freeberg  (1969)  reported  a study  in  which  peer 
evaluations  were  more  highly  related  to  a performance  criterion  when 
the  interaction  between  peers  was  relevant  to  the  dimension  being 
evaluated.  Bayroff  and  Machlin  (Note  16)  found  that  leadership 
evaluations  could  be  made  in  an  academic  environment  and  were  highly 
related  to  evaluations  done  after  exposure  to  a situation  where 
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leadership  wet  displayed.  Levin,  Dubno,  and  Akula  (1971)  indicated 
that  video  tapes  supplied  sufficient  information  for  reliable  evalua- 
tions and  were  highly  related  to  evaluations  from  group  members. 

It  would  be  safe  to  assume  that  peer  evaluations  of  a variety 
of  complex  human  behaviors  can  be  rendered  reliably  after  exposure  of 
the  peers  to  each  other  in  situations  which  require  the  individual  to 
Interact  either  with  tho  environment  or  with  other  people  in  work 
oriented  or  socially  oriented  situations.  Further,  it  can  be  surmised 
that  the  validity  of  the  evaluations  will  be  a function  of  the  degree 
to  which  the  particular  behaviors  are  relevant  to  the  dimension  under 
study.  Hollander  (1956)  found  that  reliable  evaluations  were  given 
after  one  hour  of  discussion  between  peers  in  a Naval  OCS  class,  but 
that  they  had  only  a moderate  degree  of  relationship  with  evaluations 
after  1 weeks,  and  were  not  ss  predictive  of  eventual  job  performance. 
This  convergence  of  views  by  peers  after  a short  period  of  exposure 
is  probably  a function  of  similar  psychological  tnnps  of  behavior  on 
the  part  of  peers,  and  the  preliminary  evaluations  on  limited  informa- 
tion are  subject  to  revision  based  upon  further  information.  There 
would  seem  to  be  little  advantage  of  one  evaluative  technique  over 
another  as  long  as  the  technique  does  not  require  the  evaluator  to 
make  finer  discriminations  than  are  possible  based  on  the  type  of 
Interaction. 
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Summary 


The  pear  evaluation  technique  has  been  used  by  researchers  both 
as  a criterion  of  complex  human  bahavior  and  as  an  index  of  future 
potential.  In  either  case,  the  particular  dimension  measured  has 
varied  considerably.  The  present  paper  reviewed  the  psychometric 
properties  and  related  research  findings  of  the  four  major  techniques 
(ratings,  rankings,  full  nominations  and  high  nominations).  Several 
important  similarities  end  differences  were  Indicated.  For  example, 
only  ratings  can  produce  comparable  scores  across  different  groups 
without  extensive  assumptions.  In  addition,  results  of  research  Indicate 
little  differences  in  measurement  reliability  between  techniques.  The 
limited  findings  also  indicate  that,  in  general,  ratings  and  rankings 
are  less  acceptable  and  less  feasible  than  either  of  the  nomination 
techniques. 

Furthermore,  a review  of  both  the  documented  and  likely  effects 
of  various  situational  factors  on  the  evaluation  process  indicated 
the  potential  for  major  problems  unless  tie  researcher  is  aware  of  the 
Issues.  While  no  direct  relationship  was  found  between  group  size  and 
reliability  or  validity  of  the  evaluations,  it  can  be  assumed  that  very 
small  or  large  groups  will  produce  less  reliable  and  less  valid  scores. 
Croup  structure  and  individual  differences  were  found  to  be  sources  of 
potential  problems  which  must  be  monitored  and  dealt  with  by  the 
researcher.  The  popular  issues  of  friendship,  acquaintanceship  and 
type  of  interaction  were  reviewed,  and  there  is  little  evidence  that 


they  have  a major  impact  on  the  validity  of  the  scores.  Indications 
ara  that  all  techniques  are  relatively  impervious  to  a variety  of 
situational  factors  with  the  nomination  technique  being  perhaps  the 
mat  versatile. 

In  brief,  it  has  been  shown  that  peer  evaluations  have  been  a 
fruitful  tool  in  both  research  and  application.  Several  issues  regard- 
ing their  use  remain  to  be  resolved,  but  there  is  sufficient  evidence 
to  suggest  that  these  issues  are  soluble  and  do  not  detract  from  the 
conclusion  that  pear  avaluations  are  a very  powerful  tool  for 
discriminating  complex  human  behavior. 


Figure  1.  Summary  of  Methodological  Issues 
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OBJECTIVE  ANALYSIS  OF  CAMOUFLAGE  VIA 
IMAGE  INTERPRETERS 


RONALD  L.  JOHNSON 

US  Army  Mobility  Equipment  Research  end  Development  Center 
Fort  Belvoir,  Virginia  22060 

ABSTRACT.  In  the  past  the  assessment  of  camouflage  effectiveness  by  its 
subjective  nature  has  been  difficult  to  objectively  quantify.  To  accomplish 
this,  63  image  interpreters  analyzed  imagery  of  a missile  site.  Subjects 
reported  which  visual  cues  enabled  site  detection  and  identification.  There 
were  63  detections  end  59  identifications  with  13  visual  cue  categories  for 
detection  and  12  for  Identification.  The  frequency  of  response  per  category 
ranged  from  41  to  1 for  detection  and  42  to  1 for  identification.  These 
frequencies  were  analyzed  by  the  statistical  technique  "Minimum  Contrasts" 
at  a level  of  significance  .05  and  ,01.  This  procedure  objectively  determined 
which  target  items  were  well  camouflaged  and  which  needed  improvement. 

I.  INTRODUCTION  . 

The  camouflage  of  military  installations  is  becoming  increasingly  critical 
as  both  ground  and  aerial  surveillance  techniques  improve.  The  goal  of  the 
camouflage  is  to  increase  the  survivability  of  the  Installation.*, , and, 
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simultaneously,  to  be  cost  effective.  There  is  always  the  need  for  a reliable 
measure  of  the  military  worth  of  camouflage.  This  cannot  be  estimated, 
however,  without  quantifying  the  effects  of  the  applied  camouflage.  In  the 
past,  this  has  been  extremely  difficult  due  to  its  inherent  subjectivity. 

The  purpose  of  this  paper  is  to  demonstrate  a method  for  that  quantification 
using  the  statistical  technique  "minimum  contrast"  to  obtain  an  item  analysis 
of  the  subjective  cues  identified  by  operational  image  interpreters, 

II.  DESIGN  OF  EXPERIMENT. 

| The  SAM  site  selected  for  experimentation  was  situated  in  a German 

agricultural  area.  Three  levels  of  camouflage  were  applied.  The  first  was 
uncamouflaged.  The  second  consisted  of  tone  down  painting  all  roads  and 
buildings,  plus  construction  of  an  adjacent  decoy  site.  The  third  level  was 
camouflaged  by  simulating  the  surrounding  agricultural  fields  and  trees. 

This  was  accomplished  by  using  camouflage  nets,  directional  plowing, 
grass  herbicide,  and  supplementary  planting  of  shrubs  and  trees.  The  decoy 
site  in  the  second  level  was  removed.  Each  of  the  three  levels  were  photo- 

! graphed  with  60%  forward  overlap  using  the  following  5"  format  Kodak  film: 

Black  and  White  Plus  X Kodak  #2402 
Normal  Color  Kodak  #2448 

Color  Infrared  Kodak  #2443 

f 

I The  resulting  imagery  was  cut  into  strips  approximately  15  frames  long,  the 

i 

ISAM  site  occupying  at  least  two  of  the  15  frames.  Sixty-three  US  Marine 
Corps  image  interpreters  were  given  thirty  minutes  to  analyze  these  film 
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strips.  Each  lovel  of  camouflage  and  each  type  of  film  were  viewed  by 
7 randomly  selected  Image  Interpreters.  Each  Interpreter  was  used  only 
once.  The  visual  cues  that  enabled  the  image  Interpreters  to  make  a 
detection  and  or  an  identification  were  recorded  on  the  data  sheet  at  the 
end  of  each  test  session, 
in.  EXPERIMENTAL  RESULTS. 

All  63  of  the  Image  Interpreters  detected  the  SAM  within  the  alloted 
30  minutes.  Fifty-nine  identified  the  site.  The  interpreters  cited  13  visual 
cues  which  contributed  to  the  site's  detection  and  12  other  visual  cues 
aiding  site  identification.  The  visual  cues  for  both  detection  and  identifi- 
cation are  extrapolations  of  specific  military  aspects  of  typical  cues  of 
psychophysical  stimuli  materials  such  as  size,  shape,  contrast,  texture,  and 
color.  The  cues  cannot  be  identified  in  this  report  due  to  security  classifi- 
cation. but  are  included  in  a confidential  report  by  the  author  V.  Tables  1 
through  7 contain  these  detection  cues  averaged  across  different  combinations 
of  camouflage  level  and  film  type.  In  addition  each  table  shows  the  frequency 
the  cue  was  reported  by  the  image  interpreters  and  which  cues  are  significantly 
different  from  each  other  at  the  .05  and  .01  level.  This  test  of  significance 
was  calculated  using  the  technique  of  "minimum  contrast"  V.  "Minimum 
contrasts"  is  a method  to  compare  two  proportions  to  determine  whether  the 
observed  contrast  is  significant  at  the  chosen  level.  The  proportions  for  this 
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TABLE  I 


Significant  Differences  in  Detection  Between  Visual  Cues  Averaged  Across 
All  Levels  of  Camouflage  and  Film  Types. 

ABC  DEFGHIJKLM  Frequency 
A 41 


B 

44 

22 

C 

** 

21 

D 

** 

19 

E 

** 

* 

* 

* 

11 

F 

** 

** 

** 

4 

10 

G 

** 

** 

** 

4 

B 

H 

44 

** 

** 

4* 

8 

I 

44 

** 

44 

44 

8 

J 

** 

** 

** 

44 

6 

K 

** 

** 

** 

44 

5 

L 

** 

44 

** 

44  4 4 

3 

M 

** 

44 

* * 

44  44  44  4 4 4 

1 

Cell  Size  « 63 

* > Significant  Difference  at  a ® .05 
**■»  Significant  Difference  at  a * .01 

TABLE  2 

Significant  Differences  in  Detection  Between  Visual  Cues  Averaged  Across 
Film  Types,  Uncamouflaged  Level. 

ABC  DEFGHI  JKLM  Frequency 


A 13 

B 10 

C 7 

D **  4 

E **  - 2 

F **  - 3 

G ****_.  1 

H **  - 3 

1 **  4 

J **  4 

K **  3 

L **  - 3 

M **  44  — j 


Cell  Size  = 21 

- = Border  Line  Significance  at  a « ,05 
*»  Significant  Difference  at  « = .05 
**  = Significant  Difference  at  a = .01 
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TABLE  3 


F 


Significant  Differences  in  Detection  Between  Visual  Cues  Averaged  Across 
Film  Types,  Tone-Down  Plus  Decoy  Level, 


A 

B 

C 

D E 

F 

G 

H 

I 

J K 

L 

M 

Frequency 

A 

14 

B 

- 

6 

C 

- 

6 

D 

10 

E 

#* 

** 

2 

F 

* 

- 

5 

G 

** 

* 

3 

H 

** 

** 

2 

I 

** 

** 

1 

J 

** 

* 

* 

** 

- 

0 

K 

** 

*+ 

1 

L 

★* 

* 

* 

#* 

- 

0 

fyj  *1 »■  ****-.  g 

Cell  Size  * 21 

- * Border  Line  Significance  at  a «*  ,05 
* «*  Significant  Difference  at  a » ,0fi 
**  “Significant  Difference  at  a a ,01 

TABLE  4 

Significant  Differences  in  Detection  Between  Visual  Cues  Averaged  Across 
Film  Types,  Full  Camouflage  Level, 

ABC  DEFGHIJKLM  Frequency 


A 14 

B - G 

C 8 

D * 5 

.E  * 6 

F **  2 

G **  4 

H **  3 

I **  3 

J **  2 

K **  2 

L **  * * - * 0 

M'  **  * * - * 0 


Cell  Size  - 21 

- * Border  Lino  Significance  at  a = .05 
* « Significant  Difference  at  a = ,05 
* ■'  ^ Significant  Difference  at  a - .01 
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TABLE  5 


I 


Significant  Differences  in  Detection  Between  Visual  Cues  Averaged  Across 
Camouflage  Levels , Film  Type  - B&W  Plus  X. 

ABC  DEFGHIJKLM  Frequency 


A 16 

B * 7 

C * 6 

D **  4 

E **  2 

F **  2 

G **  3 

H **  5 

I **  - 1 

J **  - 1 

K **  2 

L **  - 1 

**  **  **  0 


Cell  Size  - 21 

- = Border  Line  Significance  at  « = .05 
* » Significant  Difference  at  « » .05 
**  ■ Significant  Difference  at  « = .01 

TABLE  6 


Significant  Differences  in  Detection  Between  Visual  Cues  Averaged  Across 
Camouflage  Levels,  Film  Type  - Color. 


ABC  D E F 
A 
B 
C 
D 


E 

** 

F 

** 

G 

** 

H 

** 

** 

** 

*4t 

I 

* 

J 

** 

* 

K 

** 

** 

- 

- 

L 

** 

** 

** 

** 

M 

** 

** 

— 

.. 

Cell  Size  * 21 

- = Border  Line  Significance  at 
* - Significant  Difference  at  <*  = 
**  - Signicont  Difference  at  “ = 


G H 


a = ,05 
.05 
.01 
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K L M 


Frequency 

13 

9 

7 

7 

4 

3 

4 
0 

5 
2 
1 
0 
1 


% 


TABLE  7 


Significant  Difference  in  Detection  Between  Visual  Cues  Averaged  Across 
Camouflage  Levels,  Film  Type  - Color  IR, 


ABC  D EFGHIJKLM  Frequency 


A 

B 

C 

D 

E 

P 

G **  * * 

H ** 

I ** 
j ** 

K ** 

L ** 

M **  * **  ** 


12 

e 

8 

8 

S 

S 

1 
3 

2 
3 
2 
2 
0 


Cell  Size  ■ 21 

- ■ Border  Line  Significance  at  a ■ .05 
* - Significant  Difference  at  a • .05 
**  * Significant  Difference  at  a ■ .01 

Tables  8 through  14  contain  the  12  visual  cues  which  contributed  to  site 
identification  averaged  across  different  combinations  ofcamouflage  and 


film  type.  Cue  frequency  and  significance  at  a * .05  are  again  Included 
as  in  the  preceding  tables.  As  before,  the  cues  cannot  bo  identified 


because  of  security  classification. 
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TABLE  8 


Significant  Differences  in  Identification  Between  Visual  Cues  Averaged 
Across  All  Levels  of  Camouflage  and  Film  Types. 

ABC  DEFGHIJKL  Frequency 
A 42 


B 

** 

25 

C 

** 

15 

D 

** 

* 

13 

E 

** 

** 

8 

F 

A* 

** 

8 

G 

** 

** 

8 

H 

** 

** 

it 

* 

4 

I 

** 

** 

** 

** 

2 

J 

** 

** 

** 

** 

2 

K 

** 

** 

** 

***** 

1 

L 

** 

** 

Hit 

**  * * ★ 

1 

Cell  Size  **  59 

* = Significant  Difference  at  a = .05 
**■>  Significant  Difference  at  a =»  .01 

TABLE  9 

Significant  Differences  in  Identification  Between  Visual  Cues  Averaged 
Across  Film  Types,  Uncamouflaged  Level. 

ABC  DEFGHIJKL  Frequency 
A 17 


B 

** 

8 

C 

** 

2 

D 

** 

7 

E 

** 

5 

F 

** 

6 

G 

** 

4 

H 

** 

2 

I 

** 

★ 

* 

1 

J 

*★ 

* 

* 

1 

K 

** 

** 

**  * 

* 

0 

L 

*+ 

** 

**  * 

* 

0 

Ceil  Size  = 17 

* = Significant  Difference  at  a = .05 
**-  Significant  Difference  at  a = .01 


TABLE  10 


Significant  Differences  in  Identification  Between  Visual  Cues  Averaged 
Across  Film  Types,  Tone-Down  Plus  Decoy  Level. 

ABC  DEFUH1IKL  Frequency 


A 12 

B 6 

C 7 

D * 4 

E **  2 

T **  * 1 

G **  2 

II  **  * 1 

I **  # **  0 

j **  * 1 

k **  * 1 

^ **  * **  o 

Cell  Size  = 17 

* = Significant  Difference  at  « * .05 
**»  Significant  Difference  at  a = .01 


TABLE  11 

Significant  Differences  in  Identification  Between  Visual  Cues  Averaged 
Across  Film  Types,  Full  Camouflage  Level. 
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1 


... 


TABI.C  12 

Significant  Differences  in  Identification  Between  Visual  Cues  Averaged 
Across  Camouflage  Levels,  Film  Type  - B&W  Plus  X. 


A 

A 

B 

B 

* 

C 

** 

D 

★ * 

E 

** 

F 

** 

G 

★ * 

* . 

H 

** 

I 

** 

** 

J 

** 

** 

K 

** 

■** 

L 

** 

** 

Cell  Size  * 

G 11  I 


J 


Frequency 

IB 

8 

2 

4 

3 

3 

1 

2 

0 

0 

0 

0 


17 

* « Significant  Difference  at 
Significant  Difference  at 


a = .05 

a = ,01 

TABLE  13 


Significant  Differences  in  Identification  Between  Visual  Cues  Averaged 
Across  Camouflage  Levels,  Film  Type  - Color, 


D E F 


H I 


J K L 


A 

B 

C 

D 

E 

F 

G 

II 

I 

I 

K 

L 


* 

* 

* 

** 

** 

** 

** 

** 


*• 

* 

* 

** 
«* 
★ * 
* * 
** 


Frequency 

10 

10 

4 

f» 

2 

3 

2 

1 

1 

1 

1 

1 


Coll  Size  - 17 

* = Significant  Difference  ot  a =»  .05 
**  = Significant  Difference  at  01  ~ .01 
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TABLE  14 

Significant  Difference*  in  Identification  Between  Visual  Cues  Averaged 
Across  Camouflage  Levels,  Film  Type  - Color  IR. 


D E F G H I 


A 

B ** 

C ** 

D ** 

E ** 

F ** 

G ** 

H **  * 

I **  * 

j **  * 

K **  ** 

L **  ** 


Frequency 

17 

7 

9 

4 
3 
2 

5 
1 
1 
1 
0 
0 


Cell  Size  - 17 

* **  Significant  Difference  at 
**a  Significant  Difference  at 


IV.  DISCUSSION . 

A review  of  tables  1-7  demonstrates  that  the  isolation  of  the  critical 
visual  cues  for  site  detection  was  accomplished  by  the  use  of  "minimum 
contrasts."  Detection  cue  A was  a significant  factor  in  all  tables  for  the 
detection  of  the  SAM  site.  There  was  virtually  no  change  in  the  importance 
of  this  cue  in  site  detection  when  analyzed  across  levels  of  camouflage 
and  film  type.  Therefore, more  effort  must  be  expended  to  prevent  this  cue 
from  becoming  a major  factor  in  target  detection.  The  addition  of  the  decoy 
site  adjacent  to  the  SAM  site  had  a pronounced  effect  in  increasing  the 
number  of  significant  cues  that  allowed  the  image  interpreter  to  detect  the 
site  (table  3).  Visual  cues  E and  F,  and  H through  M did  not  have  much 
effect  on  site  detection  either  for  level  of  camouflage  or  type  of  film  analyzed. 
The  number  of  cues  leading  to  site  detection  was  greater  for  the  color  and 
color  infrared  film  than  for  the  black  and  white  film  (tables  5-7).  As  is  well 
known,  more  information  is  presented  to  the  image  interpreter  on  color  and 
color  Infrared  film  than  on  black  and  white  imagery. 

Tables  8-H  indicate  that  the  use  of  "minimum  contrasts"  to  isolate 
the  critical  visual  cues  in  the  identification  of  the  SAM  site  was  successful. 
Visual  cues  important  for  site  identification  %vere  different  from  those  for  site 
detection.  Identification  cues  A ami  B were  the  most  important  except  for 
camouflage  level  two  containing  tone-down  and  site  decoy.  For  this 
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case , cues  A and  C were  the  most  prominent  in  site  identification  (table  10) . 
The  effects  of  visual  cue  C were  essentially  nil  for  levels  one  and  three 
(Tables  9 and  11) . Visual  cues  D through  L had  little  effect  on  site  identi- 
fication when  analyzed  by  level  of  camouflage  or  type  of  film.  Color  infrared 
film  generated  more  visual  cues  to  target  identification  (Table  14  than  both 
color  and  black  and  white  films  (Tables  12-13) . We  consider  this  to  be 
due  to  the  greater  amount  of  information  presented  to  the  image  interpreter 
with  color  Infrared  film  than  for  the  other  two  film  types . The  results 
indicated  that  this  approach  was  a valid  method  to  objectively  analyze 
subjective  cues. 

V.  SUMMARY  AND  CONCLUSIONS  . 

The  problem  faced  in  this  study  was  to  objectively  analyze  the  effects 
of  levels  of  camouflage  on  detection  and  identification,  A SAM  site  was 
selected  and  photographed.  Subjective  visual  cues  were  elicited  from 
operational  image  Interpreters  in  response  to  specially  prepared  classified 
packets  of  site  photography.  These  cues  for  both  detection  and  identifi- 
cation were  grouped  into  categories  and  analyzed  for  significance  using  the 
technique  of  "minimum  contrasts" , This  technique  facilitated  the 
quantification  of  the  subjective  cues  used  by  image  interpreters  in  site 
detection/identlflcation  for  levels  of  camouflage  and  types  of  photographic 
film. 
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A SIMPLE  METHOD  FOR  DETERMINING  THE 
UNRESTRICTED  AVERAGE  OUTGOING  QUALITY 
LIMIT  (UAOQL)  OF  A CONTINUOUS  SAMPLING  PLAN 
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Rock  island , Illinois 


ABSTRACT.  This  paper  provides  a simple  algorithm  for 
determining  the  unrestricted  Average  Outgoing  Quality 
Limit  (UAOQL)  of  a continuous  sampling  plan.  The  deriva- 
tion of  the  algorithm  is  shown. 

1.1  INTRODUCT ION . As  a prerequisite  to  a discussion 
of  the  UAOQL , some  review  of  the  fundamentals  of  continu- 
ous sampling  is  in  order. 

Most  statisticians  are  familiar  with  the  concept  of 
sampling  from  a lot.  For  example,  we  might  have  a lot  of 
one  hundred  items,  from  which  a sample  of  size  seven  has 
been  drawn.  The  acceptance  decision  for  the  lot  will  be 
based  on  the  results  found  in  the  sample.  For  example, 
the  rules  of  the  sampling  plan  we  are  using  might  say  that 
if  two  or  fewer  units  out  of  the  sample  of  size  seven  are 
defective,  we  shall  aocept  the  lot.  If  three  or  more  units 
are  defective,  we  shall  reject  the  lot. 

Under  continuous  sampling,  we  do  not  use  the  concept 
of  sampling  a certain  number  of  units  from  a lot  of  material. 
Instead,  we  carry  out  inspection  as  the  units  are  produced 
and  flowing  along  the  production  line. 

The  prerequisites  for  using  a continuous  sampling  plan 

aret 

a.  Moving  product. 

b.  Ample  physical  facilities  for  100%  inspection  when 
necessary. 

c.  Relative  ease  of  inspection 

d.  A process  capable  of  producing  homogeneous  material. 
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An  example  of  a continuous  sampling  plan  is  Harold 
Dodge's  CSP-1  [2].  Dodge  was  the  original  developer  of 
continuous  sampling  plans , and  published  his  first  work 
on  the  subject  in  1943.  Under  CSP-1,  at  the  start  of 
inspection , the  screening  crew  inspects  100%  of  the  units. 

When  some  prespecified  number,  i,  of  consecutive  units  are 
free  of  the  defects  concerned,  that  is,  the  defects  for 
whiah  we  are  inspecting,  the  screening  crew  is  released 
from  100%  inspection  and  the  sampling  inspector  inspects 
a prespecified  fraction,  f,  of  the  units,  where  the  sample 
units  are  selected  in  a random  manner  as  they  pass  the  ! 

point  of  inspection.  If  a defective  unit  is  found,  100%  1 ■ - 

inspection  is  resumed,  and  the  cycle  repeats  itself  as 
necessary  during  the  remainder  of  production. 

We  made  mention  of  the  values  of  i and  f,  which  are 
specified  for  each  individual  CSP-1  plan.  For  example, 
we  might  have  a clearance  number,  i,  of  twenty  units,  and 
a sampling  frequency,  f,  of  one  in  ten. 

Some  of  the  functional  properties  of  a CSP-1  plan 
(or  any  CSP  plan  for  that  matter)  that  are  usually  of 
interest  to  the  statistician  are  the  following i 

a.  The  Average  Fraction  Inspected,  of  AFI,  which  is 
the  expected  value  of  the  fraction  of  material  that  will 
be  inspected  over  an  indefinitely  long  period  of  time  when 
eaah  unit  has  probability  p of  being  defective. 

b.  Th®  Average  Outgoing  Quality,  or  AOQ,  which  is 
the  expected  fraction  of  material  that  is  defective  in 
acaepted  material  over  an  indefinitely  long  period  of  time 
when  each  unit  has  probability  p of  being  defective. 

c.  The  Average  Outgoing  Quality  Limit,  or  AOQL, 
which  is  the  maximum  value  of  AOQ. 

Thus  far,  we  are  talking  about  properties  based  on 
the  assumption  that  each  unit  has  probability  p of  being 
defective.  This  is  of  course  a very  restrictive  assump- 
tion, sinoe  one  might  intuitively  feel  that  in  the  real 
life  situation,  p would  undergo  some  sort  of  variation 
over  time.  For  this  reason,  statisticians  concerned  them- 
selves with  the  problem  of  how  to  describe  the  mathematical 
properties  of  continuous  sampling  plans  when  p varied  over 
time.  In  19S3,  Lieberman  [4]  presented  an  analysis  of 
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CSP-1  under  the  assumption  that  p was  not  constant  for  eaoh 
unit*  He  determined  that  the  worst  situation  would  be  the 
one  where  only  good  units  reached  the  inspector  during 
phases  of  100%  inspection , and  only  bad  units  reached  the 
inspector  during  phases  of  sampling  inspection. 

The  outgoing  quality  reflected  by  this  worst  possible 
situation  eventually  came  to  be  called  the  Unrestricted 
Average  Outgoing  Quality  Limit,  or  UAOQL*  There  is  a very 
interesting  paper  on  the  UAOQL  by  Sackrowitz  [5]  in  the 
April  1975  Journal  of  Quality  Technology;  however, 

Sackrowitz* s def iniETons  are  somewhat  different  from  what 
we  will  discuss  here. 

There  are  two  general  cases  that  we  will  consider i 
that  situation  where  defective  units  found  are  removed  from 
the  flow  of  product  and  replaced  with  good  units,  and  the 
situation  where  defective  units  found  are  removed  from  the 
flow  of  product  but  are  not  replaoed  with  good  units. 

For  the  replacement  case.  White  [6]  carried  out  a 
quite  complex  derivation  involving  linear  programming  to 
show  that  for  a broad  class  of  plans,  the  UAOQL  would  re- 
sult from  that  situation  where  for  any  phase  of  inspection 
of  a plan,  either  all  good  units  are  submitted  during  every 
occurrence  of  the  phase  or  all  bad  units  are  submitted 
during  every  occurrence  of  the  phase.  White  [7]  computed 
numerical  solutions  for  plans  from  DOD  Handbook  H106. 

Endres  13] , an  employee  of  mine  at  the  time,  showed  that 
this  rule  would  apply  also  in  the  case  where  defective  units 
were  removed  from  the  flow  of  product,  but  were  not  replaced 
with  goovl  units. 

2.  DISCUSSION.  With  the  difficult  mathematical  proofs 
thus  out  of  the  way,  the  possibility  of  developing  a simple 
algorithm  presented  itself  [1] . The  phases  of  inspection 
could  be  treated  as  states  in  a Markov  chain.  Remember  that 
the  UAOQL  will  result  from  a situation  where  for  every  occur- 
rence of  each  phase,  either  only  all  nondefectives  are  sub- 
mitted for  inspection,  or  only  all  defectives  are  submitted 
for  inspection. 
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Let  us  define  configurations  to  be  the  values  of 

y « <^(1) , ...»  ^Oc)) , 

where 

k ■ number  of  states 

4<j)  m o if  during  occurrences  of  the  phase  represented 
by  state  j only  nondef natives  are  submitted  for 
inspection. 

4 ( j ) - 1 if  during  occurrences  of  the  phase  represented 
by  state  j only  defectives  are  submitted  for 
inspection . 


It  is  dear  that  for  any  plan  of  the  type  we  are  considering , 
then i there  will  be  2K  configurations.  For  even  moderate 
aiaed  values  of  k,  the  problem  could  be  difficult  if  we  had 
to  consider  every  configuration.  Fortunately,  we  can  make 
the  problem  smaller. 

Let  us  first  go  through  the  case  where  defective  units  are 
removed  and  replaced  with  good  units. 

Theorem  It  If  a configuration  exists  such  that  for  any 

state  j 

(i)  ^(j)  - 0,  and 

(ii)  State  j is  an  absorbing  barrier, 

then  this  configuration  need  not  be  considered  in  deter- 
mining the  UAOQL. 

Proof i Under  the  conditions  stated  in  the  theorem,  the 
long  run  outgoing  quality  would  be  tero. 

Thoorsm  2 i If  a configuration  exists  such  that  for  any 
state  j 

(i)  t<j)  - 1,  and 

(ii)  State  j is  an  absorbing  barrier, 

then  this  configuration  need  not  be  considered  in  deter- 
mining the  UAOQL. 
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Proof  * Under  the  eonditione  stated  in  the  theorem#  the 
long  run  outgoing  quality  would  be  zero. 


We  thus  see  that  all  configurations  involving  absorbing 
barriers  can  be  disregarded. 

Consider  CSP-1.  Let  state  1 be  the  100%  inspection  state 
and  state  2 be  the  sampling  state.  We  have  the  following 
conf igurations i 

yx  - (0#  0) 

y - (0,  1) 

2 

y3  - dr  o) 
y4  - dr  i) 

Configurations  with  i(l)  ■ 1 or  ^(2)  ■ 0 can  be  disregarded, 
since  these  would  represent  absorbing  barrier  situations. 
Therefore  yi,  y3,  and  y4  can  be  disregarded.  The  remaining 
configuration,  y2,  represents  the  situation  under  which  the 
UAOQL  occurs)  no  defective  units  are  submitted  during  periods 
of  1001  inspection,  only  defective  units  are  submitted  during 
periods  of  sampling  inspection. 

Let  us  now  define  another  term. 

A sequence  of  states  which  repeats  itself  indefinitely  under 
the  conditions  imposed  shall  be  called  a cycle.  For  example, 
if  a Markov  chain  consists  of  four  states,  abd  if  a configura 
tion  results  in  a sequence  of  states  (1,  2,  3,  4,  3,  4,  3, 

4 ...),  then  (3,  4)  is  a cycle. 

Theorem  3i  The  long  run  outgoing  quality  for  a configura- 
tion  involving  aycles  is  equal  to  the  average  number  of 
defectives  passed  in  a cycle  divided  by  the  average  number 
of  units  passed  in  a cycle. 

Proof t The  long  run  outgoing  quality  is 
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!■!  defectives  passed  in  cycle 
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i«l  units  passed  in  cycle  i 
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Considering  CSP-1  again,  it  has  baan  shown  that  only  configura- 
tion Y2  " (0#  1)  naad  ba  considered.  Sinoa  tha  aaquance  of 
atataa  (1,  2.  1,  2,  . ..)  occur#,  we  may  rafer  to  (1,  2)  aa 
a oyola.  Uaing  Thaoram  3,  wa  may  than  aay  that 
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AVERAGE  NUMBER  OF  DEFEC- 
TIVES PASSED  DURING  100 t 
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whare  i ia  tha  langth  of  100%  inapection  and  f ia  the 
aampling  frequency.  Lat  ua  now  eonaider  the  caaa  whare 
defaotiva  units  found  are  removed  but  not  replaced  with 
good  unita. 


Theorem  1 » i If  a configuration  exiata  such  that  for  any 
atate  j 

(i)  $(j)  - 0,  and 

(ii)  State  j ia  an  absorbing  barrier, 

then  thia  configuration  need  not  be  considered  in  determining 
the  UAOQL. 

Proofs  Under  the  conditions  stated  in  the  theorem,  the 
long  run  outgoing  quality  would  be  zero. 

We  see  that  this  is  the  same  as  Theorem  1 for  the  replace- 
ment case. 

Theorem  2't  If  a configuration  exists  such  that  for  atate 

1 (corresponding  to  the  first  phase  encountered 
in  uaing  the  sampling  plan) 

(i)  £(1)  ■ 1,  and 

(ii)  State  1 is  an  absorbing  barrier, 

Then  thia  configuration  need  not  be  considered  in  deter- 
mining the  UAOQL. 
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Proof i Under  the  conditions  stated  in  the  theorem,  no 
product  would  be  passed  at  all,  hence,  outgoing  quality  would 
not  be  defineable. 

Theorem  3 1 « If  the  number  of  units  passed  in  a cyole  is 

greater  than  zero,  then  the  long  run  outgoing 
quality  for  a configuration  is  equal  to  the 
average  number  of  defectives  passed  in  a cycle 
divided  by  the  average  number  of  units  passed 
in  a cycle. 

Proofs  Same  as  Theorem  3 for  the  replacement  case. 

Theorem  4 1 i If  a cycle  passes  zero  units,  it  is  only  neces- 
sary, in  determining  the  long  run  outgoing 
quality,  to  consider  those  Btatea,  if  any, 
which  occur  before  cycling  begins. 

Proofs  The  fraction  defective  of  material  passed  by  the 
inspection  system  would  remain  unchanged  once  cycling  begins, 
since  no  more  units  would  be  passed.  This  theorem  is  useful 
when  a lOOt  inspection  state  other  than  state  1 is  an 
absorbing  barrier. 

As  an  example,  let  us  consider  the  simple  case  of  CPS-1 
again  under  the  nonreplacement  assumption.  We  have  the 
configurations 

y^  - (0,  0) 
y2  - (o,  l) 
y3  - (l#  o) 
y4  - (i,  i) 


Configurations  with  <i(l)  ■ 1 or  |(2)  ■ 0 can  again  be  dis- 
regarded, since  these  would  represent  absorbing  barrier 
situations  with  no  defective  units  passing.  Again  y2  * 

(0,  1)  is  the  only  configuration  that  need  be  considered. 
In  the  replacement  case,  then. 


UAOQL 


AVERAGE  NUMBER  OF  DEFEC- 
TIVES PASSED  DURING  100% 

xvHOBffi' " MKffiBirar  wi  w ~ 

PASSED  DURING  100) 


AVERAGE  NUMBER  OF  DEFEC- 
TIVES PASSED  DURING  SAMPLING 
A\/ERAdE  NUMBER  OF  UNITS 
PASSED  DURING  SAMPLING 


0 + (- 
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In  our  examples,  we  have  used  the  ainpleet  case,  CSP-1. 
However,  in  practice,  we  have  found  that  we  oan  uae  thia 
method  for  plana  of  aome  complexity  in  order  to  determine 
the  UAOQL  for  either  the  replacement  or  the  nonreplacement 

caae. 
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ABSTRACT . Given  a Markov  Chain  (MC)  model  for  a particular  Continuous 
Sampling  Plan  (CSP),  a method  of  restructuring  lta  states  into  a simpler 
Semi  Markov  Chain  (SMC)  pattern  is  used  to  analyse  MC  functionale  which 
are  partially  defined  by  random  backward  shlfta  in  operational  time. 

Specifically , the  usual  MC  model , for  the  Job  Shop  case  of  CSP-1, 
initially  starts  with  an  inspection  phase  of  I states  and  thereafter  cycli- 
cally alternates  between  it  and  a sampling  phase.  However,  whenever  sampling 
is  terminated,  this  plan  is  modified  by  the  additional  requirement  of  a 
(limited)  Downstream  Inspection  (DSI)  of  the  previous  I units  followed  by  a 
phase  transition  determined  by  the  outcome  of  such  an  inspection.  For  a pro- 
duction run  of  length  N,  this  modification  induces  a corresponding  one  in  the 
expected  value  of  the  associated  basic  functional:  Fractlo-  Inspected  [FI(N;1)J. 

Both  modifications  are  handled  here  by  1.)  slightly  changing  the  usual  SMC 
reduction  and  2.)  coupling  this  change  with  a new  functional:  Incremented 

Fraction  Inspected  [IFI(N;2)].  The  expected  value  of  the  functional  Total 
Fraction  Inspected  (TFI(N;2)j  la  then  expressed  as  the  expected  value  of  the 
sum  of  two  terms:  the  new  functional  and  the  (unmodified)  functional,  FI(N;2), 

defined  on  the  altered  SMC.  In  addition  to  comparing  the  long  run  expressions 
for  TFI  and  AFI,  a comparison  is  also  made  between  TFI  and  the  expression  which 
results  from  the  more  familiar  requirement  of  (limited)  Upstream  Inspection 
(USI). 

In  analyzing  the  above  situation  for  finite  N,  two  interpretations  of 
DSI  are  subsequently  studied.  The  first,  based  on  possible  inspection  or 
manufacturing  irregularities  in  both  phases,  Is  the  scheme  already  referred 
to  above.  The  second,  based  only  on  the  putative  assumption  of  sampling  phase 
irregularities,  is  a less  strict  version.  For  N infinite,  a comparison  is  made 
between  the  expected  values  of  the  two  TFI’s. 

Since,  in  either  of  the  two  plans,  TFI  does  not  explicitly  take  account 
of  multiple  inspections  of  the  same  unit , other  measures  of  plan  performance 
are  considered  which  do.  To  this  end,  the  paper  concludes  with  a study  of  the 
functional  Fraction  of  Repetitions  (FR),  its  first  moment,  and  a variant  func- 
tional. In  order  to  deal  with  this  functional,  further  modifications  in  the  SMC 
model  for  CSP-1  are  necessary. 
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1.0  BACKGROUND . 

1.1  Introduction.  The  principal  aubjact  of  thla  paper  la  the  etudy  of 

variations  in  one  member  of  a class  of  sampling  plans  and  functionals  de- 
fined on  these  variations.  The  class  referred  to  is  that  of  certain 
Continuous  Sampling  Plans  (CSP)  which  are  treated  as  finite  state , irre- 
ducible, time  homogeneous,  and  aperiodic  Markov  ChainB  (’1C) . The  element 
referred  to,  classically  denoted  by  CSP-1,  is  the  simplest  element  of  thla 
class.  In  dealing  with  these  MC  models,  four  different  kinds  of  standard 
groupings,  called  phases,  of  their  states  ran  be  distinguished:  screening 

(sc),  unlimited  sampling  (ula),  limited  sampling  (Is),  and  checking  (ck). 
Using  the  terminology  of  phases,  a given  CSP  can  then  be  defined  as  a 
collection  of  two  or  more  different  phases  (normally,  one  of  which  is  sc) 
which  are  linked  together  in  accordance  with  sampling  frequency  criteria. 
Throughout  the  bulk  of  the  paper,  only  the  two  canonical  phases  making  up 
CSP-1  will  be  considered:  interest  will  be  especially  focusod  on  structural 
changeo  in  uls  which  are  brought  about  by  Downstream  Inspection  (DS1) , At 
the  end  of  Chapter  3,  the  checking  phase  will  also  be  briefly  considered 
since  it  can  be  regarded  as  Upstream  Inspection  (USI) . 

\ ' 

CSP-1  and  the  major  variation  in  it,  brought  about  by  DSI,  are  portrayed 
in  Figure  1. 


Figure  1 
CSP-1  and  DSI 
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In  Figure  1,  C8P-1  conslete  of  the  top  two  boxes  connected  together 
with  the  solid  lines.  The  DSI  plan,  denoted  by  CSP-12,  la  obtained  from 
CSP-1  by  replacing  the  top  solid  line  by  the  dotted  ones  and  adding  the 
lower  box.  Two  approaches  will  be  used  to  handle  this  change. 

The  first  approach,  given  In  Chapters  2 and  3,  consists  in  counting 
only  the  extra  units  inspected  without  regard  to  any  inspection  repetitions 
due  to  DSI.  In  the  second  approach,  given  In  Chapter  4,  all  units  Inspected 
are  also  counted,  but  now  including  repetitions.  Both  approaches  use,  as 
the  main  tool.  Semi  Markov  Chain  (SMC)  reduction  of  MC  models  which  is  now 
briefly  described. 

In  describing  SMC  reduction,  the  term  macrostate  will  be  used  to  refer 
to  an  ensemble  of  MC  states  which  is  structured  as  a (discrete)  SMC  state 
(e.g.,  a canonical  phase  of  a CSP).  To  be  a macrostate,  an  ensemble  must 
satisfy  two  conditions.  1.)  The  MC  probability  of  entrance  vector  (pev) 
Into  the  ensemble,  given  that  such  an  entrance  occurs,  must  be  stationary 
and  Independent  of  the  state  from  which  the  entrance  is  made.  In  other 
words,  letting  the  ensemble  S be  composed  of  the  k MC  states,  j,  1 1 j 1 k, 
we  impose  the  condition  that,  for  an  arbitrary  time  n, 


v(n)  * v 


where 

v(n)  - (vj (n) . v2(n),  , vk(n)) 

vj(n)  » P[M(n)  ■ j|M(n)  in  S,  M(n-l)  not  in  S) 

and 

M(«)  is  the  MC  process. 

2.)  Subject  to  the  restrictions  of  1.)  for  a given  target  macrostate,  an 
exit  can  occur  from  a MC  state  of  the  ensemble  into  a MC  state  of  the 
macrostate  only  if  the  first  state  communicates  with  the  second  in  the 
underlying  MC.  To  avoid  a circular  construction,  we  finally  note  that 
any  MC  state  is,  itself,  a (trivial)  macrostate. 

Two  different,  but  equivalent,  methods  can  be  used  to  construct  such 
macrostates  from  MC  states:  the  MC  method,  which  is  pedestrian,  but  straight- 

forward, and  the  SMC  method,  which  is  more  subtle  but  nearer  to  the  general 
idea  of  SMC  reduction.  Under  either  method,  MC  functionals  induce  well  defined 
SMC  ones  and  the  MC  properties  of  time  homogeneity,  irreducibility , and 
aperiodiclty  are  preserved  [cf.,  6.2  and  6.8]. 
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In  the  MC  method,  the  component  at ACM  of  a given  aaerostata  and 
the  posalbla  axle  aacrostatas  are  the  transient  states  and  the  absorbing 
states,  respectively,  of  an  absorbing  MC  which  Is  derived  froai  a parti- 
tioning of  the  original  MC.  The  possibly  defective  probability  density 
function  (pdf)  of  a transition  of  the  nacroetate  to  any  one  of  the  target 
aacrostatas  is  than  just  the  weighted  sun  of  the  first  entry  probability 
functions,  each  weighed  by  the  component  In  the  etatlonary  pav.  In  the 
■ore  constructive  SMC  aethod,  a given  parent  nacroetate  Is  considered  to 
be  made  up  of  two  or  more  smaller  aacrostatas  (Including  a MC  state  with  : 

or  without  self  tranaltlons) . To  such  a division,  the  "MC  aethod"  is 
applied,  only  now  to  an  absorbing  SMC.  The  derived  system  of  Backward 
Squat lone  (see  A. 21),  or.  In  slapler  situations,  direct  combinatorial 
analysis  Is  then  applied  to  obtain  the  resulting  first  entry  SMC  probabil- 
ity functions.  Their  weighted  sum,  again  weighed  by  components  of  the 
(induced)  stationary  pev,  yields  the  pdf  of  the  parent  aaerostata  (to  some  i 

one  target  aaerostata).  This  latter  method  is  easier  to  use  and  Intuitively  ! 

more  appealing;  it  will  be  used  almost  exclusively  throughout  this  paper  1 

except  for  a simple  example  of  the  MC  method  given  at  the  end  of  Chapter  1.  ! 

Furthermore,  the  SMC  method,  at  any  stage  in  its  use,  emphasises  the  concepts  I 

1.)  of  constructing  from  a given  MC  a class  of  SMC's  which  is  partially  j 

ordered  by  filtration  [6.2  and  6.7}  and  2.)  of  using  different  elements  of  1 

this  class  to  attack  either  different  problems  or  different  stages  of  one 
problem  which  arise  from  the  original  MC.  j 

Neither  of  these  two  methods  should  be  confused  with  the  process  of 
lumping  as  it  is  defined  in  [6.13].  In  fact,  for  CSP's,  It  Is  not  possible 
to  lump  the  states  in  each  phase,  in  the  above  sense,  into  a new  MC  state. 

A more  thorough  presentation  of  SMC  reduction,  with  many  appllcatlona , can 
be  found  in  [6.2).  What  notation,  definitions,  and  theorems  concerning  SMC's 
that  are  needed  in  this  paper  are  taken  from  this  reference  and  can  be  found 
in  the  Appendix.  A more  heuristic  approach  to  SMC  reduction  (for  the 
stationary  case)  together  with  further  applications  can  be  found  in  [6.4, 

6.3,  and  6.6]  where  it  is  called  The  Simplified  Markov  Chain  Method. 

In  nummary,  the  MC  method  can  be  stated  as  follows.  Given  the  compo- 
nents, v j , of  the  pev  and  the  MC  first  entrance  probability  function 


from  j to  a target  macrostate  (absorbing  state)  A,  the  equation  for  the  pdf 
from  S to  A is  (see  Appendix  for  notation) 

k 

<Wn>  " Z)  vjfJ,A  • (AD 

j-1 


3 
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Similarly,  tha  SMC  method  leads  to  the  aame  form  for  the  RH8  of  Eq.  A1 
in  which  the  f'e  ere  replaced  by  SMC  first  entrance  probability  functions. 

Another  ubiquitous  tool,  used  in  concert  with  SMC  reduction,  Is  the 
(-transform.  The  transform  la  applied  here  to  probability  sequences 
rather  than  to  the  transitional  matrices  themselves.  This  approach  is 
taken  because,  in  practical  applications,  the  ranks  of  the  matrices  are 
quite  large  (about  3 x 102  and  greater).  Thus  the  ranks  of  the  complex 
functional  matrices,  obtained  via  the  transform,  would  be  so  large  that 
1.)  important  relationships  would  be  obscured  and  2.)  an  analysis  of  them 
would  be  almost  as  difficult  as  that  done  without  the  transform.  The 
salient  features  of  the  transform  can  be  found  in  [6.3  and  6.12].  We  re- 
cord here  only  some  basic  notation  that  will  be  used  with  sequences  treated 
as  functions  from  the  natural  numbers  to  the  reals.  Given  a sequence  a(n), 
a(z)  is  its  s-transform.  Given  sequences  a(n)  and  b(n),  a*b(n)  is  their 
convolution.  6n(k)  denotes  the  (Dirac)  sequence  which  is  one  for  the 
argument  equal  to  n and  zero  otherwise;  Jn(z)  ■ l/sn.  H^k)  denotes  the 
(Heaviside)  sequence  which  is  one  for  the  argument  greater  than  or  equal 
to  n and  zero  otherwise;  &„(()  ■ (L/zn) (z/(z-l)) . 

1.2  SMC(l)  and  FI(N;1).  The  basic  premise  used  in  modelling  a CSP  is 
that  the  underlying  production  process  is  a Bernoulli  process  with  a con- 
stant probability  of  defective  p (and  probability  of  non-defective  q ■ 1-p) . 
In  particular,  the  MC  structure  of  CSP-1,  which  is  fully  described  in  [6.1, 
6.2,  and  6.4],  arises  from  the  sequential  sampling  Bcheme  imposed  on  the 
above  process  with  an  operational  time  defined  by  the  flow  of  non-repeating 
production  units.  The  SMC  model  of  CSP-1,  derived  from  the  MC  model,  is 
given  in  Figure  2 and  is  denoted  by  SMC(l). 

Figure  2 

SMC  Model  of  CSP-1  (SMC(l)) 
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For  the  model  In  Figure  2,  I ■ clearance  number  for  ac,  f - sampling 
frequency  for  uls,  p ■ probability  of  defective,  q * 1-p , and  we  have  the 
following  statements  expressed  In 


SMC. 


Theorem  1.  Let  sc  ■ 1 and  uls  - 2.  Then,  SMC(l)  is  an  irreducible 


Proof . The  SMC  states  are 


(1;  end  (2;  Q2l(*>), 


where 


§12<*>  ■ Bnd 


6 

z-& 


(1.1) 


In  Eqs.  1.1,  $(*)  ■ z*(z-l)  + y,y  ■ pq* , 5 ■ fp,  end  6 * 1-6. 


The  transitional  matrix  of  the  embedded  MC  la 

1 2 


r n 

1 0 1 

2 10 


Even  though  It  clearly  has  period  2,  the  SMC  is  none-the-less  aperiodic 
[6.2  and  6.8].  It  easily  follows  from  the  matrix  that  £ ■ (1/2,  1/2)  is 
the  stationary  (but  not  long  run)  vector. 


Using  the  notation  in  the  Appendix  (see  A. 25), 


Ml 


Aral 


pqj 


and  U2  - 


1 -4 


iSBPSB 


t 


Further  detail*  are  found  in  [6,2]  which  finishes  the  proof. 


The  expressions  (l-qI)/pqI  and  1/fp , In  Theorem  1,  appear  throughout 
the  paper  and  will  hereafter  be  abbreviated  by  the  symbols  and  y’2  , 
respectively.  These  special  primed  symbols  are  used  to  avoid  confusion 
with  standard  notation  (see  A, 25)  and,  at  the  same  time,  to  serve  as  a 
reminder  of  their  origin  (ie,  CSP-1). 

The  principal  measure  of  plan  performance  for  CSP-1  is  the  Practlon 
Inspected  (FI)  functional  which  ie  given  in 

Definition  1.  For  a production  run  of  length  N and  sampling  plan  CSP-1, 
N 

FI(N;1)  -1-J  J]  C2(j) 

{ J-0 


\ 


I 

% 

? 


where  C2(«)  is  the  characteristic  function  for  state  2 - uls  and  v ■ l-f. 

Taking  the  expected  value  of  the  above  functional,  conditioned  by  an 
initial  start  in  ec  (Job  Shop  case),  letting  N approach  infinity,  and  using 
the  Ergodlc  Theorem,  we  have  [6.1  and  6.2] 


AFI(«{1)  - l-\iP2(«;l)  (A2) 

where  the  LHS  of  Eq.  A2  is  defined  by 

Lim  ESC[FI(N;1)]. 

N •+  » 


1.3  HC  Method  (An  Example) . The  MC  method  will  be  briefly  illustrated  by 
applying  it  to  the  MC  model  of  uls.  This  model  consists  of  two  MC  states: 

SN,  the  non-inspection  state  and  SI,  the  inspection  state.  Tho  transitional 
matrix  of  the  absorbing  MC,  derived  from  the  MC  model  of  any  CSP  having  a uls 
phase,  is 


SN 

SI 

A 

SN 

V 

f 

o" 

SI 

qv 

qf 

P 

A 

0 

0 

1 

1 
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where  A it  the  only  possible  target  phase  to  ba  entered,  Tha  pav  for  tha 
ordarad  ensemble  S ■ (SN,  SI)  la  v ■ (v,f)  which  lnducaa  an  Initial  pro- 
bablllty  vactor  (v,f ,0)  for  tha  states  (SN,  SI,  A),  vhara  A la  tha  abaorblng 
state,  tha  othar  two  balng  tranalant,  Thua,  from  Eq,  Al,  wa  raad  to  darlva 
tha  axpraaalon 

<v)fSN,A  + <f>fSI,A  • 


From  tha  Chapman-Kolmogorov  aquation,  a dlffaranca  aquation  for  tha  flrat 
antry  probability  functions  can  ba  derlvad.  c-Trans forming  thla  dlffaranca 
aquation,  wa  obtain 

3ula,A<*>  - «/<*-&) 


whara  6 ■ fp  and  0 * 1-6. 

In  a similar  manner,  § A(z)  can  ba  derived  using  an  (1+1)  x (1+1) 
transitional  matrix  consisting  of  1 transient  and  one  absorbing  states 
[6.2].  Also,  for  this  latter  function,  sea  [6.10,  Chp.  13]  for  a differ- 
ent derivation  which  la  based  on  renewal  theory  and  Bernoulli  trials. 

1.4  Notation  and  Terminology.  Three  essentially  different  plans  will  be 
studied  In  future  chapters.  They  are  dsnoted  by  CSP-12,  CSP-13,  and 
CSP-14.  For  ease  In  indexing  functionals,  CSP-1  will  henceforth  be  denoted 
by  CSP-11.  SMC  models  associated  with  the  above  plana  will  ba  denoted  by 
SMC(k),  k a positive  integer;  In  one  case,  a Markov  Renewal  Process  (MRP) 
model  la  constructed  for  CSP-12  and  is  denoted  by  MRP (2).  A MC  state  with- 
out self  transitions  will  be  called  a trivial  SMC  state;  one  with  self 
transitions  will  sometimes  be  considered  as  a (non-trivial)  SMC  state  with 
a geometrically  distributed  holding  time.  A [functional]  will  usually  mean 
Etc  [functional]  for  the  models  considered.  In  particular,  with  respect 
to  some  other  set  of  models,  A [•]  could  have  an  entirely  different  defini- 
tion. Theorems,  propositions,  and  definitions  are  numbered  consecutively 
throughout  the  paper.  Statement  y of  section  x in  the  Appendix  will  be 
denoted  by  A.xy. 
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1.6  Principal  Results.  For  the  quantities  referenced  below,  6 - fp, 

6 ■ 1-4,  and  v ■ 1-f. 

Sq.  A2  gives  AFI(»;1) ; P2 <••;  1) » y'i » and  y'2  are  given  in  Theorem  1. 

Eq.  B8  gives  ATFI(«;2);  P2 <"°; 2)  le  given  in  Theorem  4 and  TFI  is 
defined  in  Definition  2. 

Eq.  C8  gives  ATF1(»;3);  P2(*{3)  is  given  by  Eq.  C7. 

Theorems  17  end  20  give  AFR(»;2)}  Theorem  18  gives  A7R'(*»;2). 

Theorem  7 compares  ATFI(*;2)  and  API(®;1);  Theorem  14  compares 
ATFI(*{2)  and  ATFI(*»{3). 


2.0  DSI  - GENERAL.  Having  Initially  started  in  the  screening  phase 
(Job  Shop  case),  if  a defect  is  found  in  the  sampling  phase  at  time  n, 

n > I,  Downstream  Inspection  (DSI)  requires  1.)  a return  to  unit  n-I  with 
100%  inspection  of  the  succeeding  I units  and  2.)  entrance  to  the  sampling 
phase  (screening  phase)  if  no  (one  or  more)  defects  are  found  upon  com- 
pletion of  1.).  DSI  is  portrayed  in  Figure  1,  Chapur  1. 

2.1  Introduction.  If  the  DSI  stage  is,  for  the  moment,  intuitively 
looked  on  as  a "pseudophase" , the  Total  Fraction  Inspected  (TFI)  can  be 
obtained  by  treating  it  as  a modification  of  Fl(Njl).  Conceptually,  this 
modification  can  be  broken  down  Into  two  separate  parts.  The  first  is  an 
additive  fractional  increase  due  to  a sum  each  term  of  which,  after 
multiplication  by  N,  ie  equal  to  v mln(k,I)  where  k+1  is  the  duration  of 
the  corresponding  sampling  phase  segment.  The  second  is  a nonlinear  de- 
crease In  FI(N;1)  due  to  the  transitional  requlremtnts  that  Lome  into  force 
upon  leaving  the  "pseudophase".  The  decrease  occurs  because,  upon  finding 
a defect,  there  is  a chance  of  immediate  (at  least  in  the  sense  of  opera- 
tional time)  return  to  the  sampling  phase  rather  than  an  automatic  entrance 
to  the  screening  phase  which  would  otherwise  take  place  in  CSP-11.  The 
finite  probability  of  this  immediate  return  results  in  a fractional  increase 
in  units  not  Inspected  and,  therefore,  a corresponding  fractional  decrease 
in  units  Inspected. 

These  remarks  lead  to  the  following  proposed  solution.  The  nonlinear 
decrease  can  be  dealt  with  by  weaving  the  transitional  requirements  of  the 
"paeudophase"  into  the  SMC  structure  of  CSP-11  thereby  yielding  a new  SMC 
and  its  Fraction  Inspected  function,  FI(N;2).  The  additive  increase  can  then 
be  easily  handled  by  coupling  a new  Incremented  Fraction  Inspected  functional, 
1FI(N;2),  to  FI(N;2).  Adding  these  two  functionals,  we  finally  have 
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Definition  2.  The  Total  Fraction  Inspected  Is  given  by 
TF1(K|2)  * FX0f}2)  + IFI(Nj2), 


In  this  chapter,  ATTI («}2)  is  found  and  compared  with  AFX(»)h), 
h ■ 1,4*  In  Chapter  4,  other  functionals  and  SMC  models  are  studied 
sines  the  one  considered  hers  and  ltu  transient  version,  treated  in 
Chapter  3,  are  not  complete  measures  of  plan  performance. 

2.2  MRP(2)  and  IFI(-t2).  For  ATFI(*;2),  the  solution  proposed  in  the 
introduction  suggests  a mMcl  for  CSP-12  given  in  Figure  3 and  denoted  by 
MRF(2).  This  model  la  a Markov  Renewal  Process  whose  definition  is  given 
in  A. 19  (also  see  A. 21). 


Figure  3 

Model  for  CSP-12  (MR? (2)) 


1 


Concerning  the  model  in  Figure  3,  we  have 

Theorem  2.  MRP (2)  is  a MRP.  Letting  sc  ■ 1 and  ula  - 2,  the  states 

are 

(l;Ql2<*))  and  (2;Q2i(0.  Q22<*>) 

where 


5u(.)  - ^ 

(2.1) 

5a. 

(2.2) 

^22 (*)  “ 

(2.3) 

3 


3 


! 

■s 


I 


I 

[ 
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g&aal.  *q.  2.1  follows  from  tho  modal  for  CSP-11.  Eqa.  2.2  and 
2.3  follow  from  tha  nodal  for  CSP«11  and  tha  introductory  ramarka  to 
Chapter  2 a Inca,  upon  conplatlon  of  a DS1  aagnant,  tha  aanpling  phaaa 
(arroenlng  phaaa)  la  antarsd  with  probability  ql  (probability  1-ql) 
with  oparatlonal  time  playing  no  rola,  MRP (2)  la  a MRP  by  dafinition. 

Tha  dafinition  of  FX(N{2)  la  of  tha  same  form  as  that  given  for 
FI(N;1)  in  Dafinition  1.  Wa  now  dafina  tha  incramantal  functional  in 

Dafinition  4.  Lat  W(>)  ba  tha  following  functional) 


- ]£  R.  + (l-c2(t))R 


whara  l • N2(t)-1  and  Rg  - ain(k.I)  if  the  ath  txit  from  state  2 takes 

place  (krfl)  time  units  from  tha  ath  entrance.  Than  tha  Incremented  Fraction 
Inspected  functional  for  MRP(2)  is 


IFI  (t;2)  - v 


whara 


u ■ 1-f. 


Filtering  out  state  1 from  MRP (2),  wa  obtain  the  pdf  of  the  renewal 
time  for  an  occurrence  of  state  2 which  is  given  by 

Q22(t)  « Q2i*Qi2 (t)  + Q22<t). 


Thus,  averaging  the  time  over  one  renewal  cycle!  we  have 


BIT]  - £ kP[T21  + T12  - k or  T22  - k) 


£ k Q*22  (k) 


- £ k Qai*haCk)  ♦ £ k <JaaOO. 
k k 

Proa  the  mean  value  property  of  tho  a-tranafora,  «e  auat  evaluate 
~*Dg (Q21Q12)  *nd  '*,Di  (Q22) 

at  a ■ 1.  Calling  the  resulte  of  the  evaluation  bj  end  &2»  reapectlvely, 

ve  have 

6 ql 

mi  - J (uH  + y<2  ) and  m2  - 


where 


7 ■ dd-q1). 


Propoaltlon  1. 

E[T]  - (l-q1)^!  + W’2 
Proof . See  above . 

Averaging  U(<)  over  one  cycle  ylelda 


E[W)  - £ kP[W«k] 


22  h66k  + I J «PI+1+3 

k J“0 


(>&) 


+ I0I+1. 


P ■}- 


Since 


»( (1i!ri)  - -j?  (i-8I+1  - sa+Ds1), 


tubs tl cut lot  the  RHS  of  this  equation  for  the  RHS  above  and  simplifying, 

we  have 

Proposition  2. 

B[W]  - 4 (1-S1)  (B2) 


Proof.  From  above. 

We  are  now  ready  to  prove 
Theorem  3.  For  MR?(2), 

IFI<«j2)  - vS<l-B*)f V 

+ "u'2  / 


[a.e.]. 


Proof.  By  the  Strong  Renewal  Theorem  [6.7,  6.9],  we  have 


Lim  UJ$1  E[Wi  . 

N E[T]  * t4-*-1' 


The  theorem  follows  from  this  result,  Props.  1 and  2,  Def.  3,  and 
simplification. 

Corollary. 


AIFI(«;2)  - vBd-S1)/ ) 

\(l-qI)y'l  + u*2  / 


Proof. 

Lim  Esc[W(S>] 

AIFl(-;2)  - 5 


2.3  8MC(2)  and  IFI (»:?).  By  its  vary  definition  ths  functional  W(0 
depends  on  the  sainple  paths  of  a MRP,  including  the  ■elf-transitions  of 
a component  state.  In  particular,  the  fundamental  probability  functions 
(see  A. 12)  of  the  Induced  SMC  (see  A. 28)  are  not  sufficient  to  describe 
W since  they  don't  record  the  self  transitions  of  His.  However,  just  as 
MRP(2)  Induces  a unique  SMC (2) , W also  induces  a correspondingly  unique 
functional,  W*  ( • ) , defined  on  the  chain.  We  first  prove 

Theorem  4,  MRP(2)  induces  a unique  SMC,  denoted  by  SMC(2). 

Proof.  From  A. 28,  SMC (2)  can  be  defined  via  its  pdf's  as  follows 
(cf.,  6.14] 


Q*2(a>  " Ql2<*> 


(4.1) 


Q21(*)  " ^22  | 

* j-0  ' 


* §21^ (1“Q22) 


(*)fc 


(4.2) 


Recalling  the  definitions  of  ui  and  y2  from  Theorem  1,  we  have,  from 
the  derivatives  of  Eqs.  4.1  and  4.2, 


ui  ■ u'l  and  U2 


where 


6 - 5 (1-q1) . 
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From  A. 25,  we  thus  obtain 


PjC-52) 


(l-sx)y\ 

(l-qI)y'l 


and  P2(-{2) 


(l-ql)y'1  4.  u'2 


(B3) 


where  1 and  2 on  the  LHS's  are  sc  and  uls*,  respectively. 

The  transitional  matrix  and  stationary  vector  of  SMC (2)  aro  the 
same  as  in  Theorem  1 which  finishes  the  proof. 

Our  goals  now  are  to  find  E[W*]  and  E[T*],  To  this  end,  we  prove 

Theorem  5.  The  functional  W(>)  induces  a well  defined  functional, 
W*(.),  on  SMC (2) . 

Proof . W*  is  implicitly  defined  through  the  following  equations. 
Conditioning  on  the  number  of  self  transitions,  j,  of  uli,  we  have 


P[W*-k]  - ]|T  P[W*-k|j]P[j] 

J-o 


- 2 aj<k> 

i 

where 

8j (k)  ■ P[W*«k  and  j repetitions]. 

Noting  that  a., (k)  can  be  defined  in  terms  of  a8(k),  s < j , we  can  derive 
a set  of  equations  relating  the  above  a's.  For  ease  in  notation,  we  first 
define 


Than,  for  0 i k & (J+1)I,  k a fixed  Integer,  we  obtain  the  eyatea  given 
below. 

aj(k)  - (Sq^a^aBCk)  + (8q)Iaj(k-I)  (5,1) 


whare 

I i k i Jl, 

- («qI)aj.1*B(k) 

where 

0 S k < I, 

- (6qI)(Bk^I)aj.1(jl) 

where 

jl  < k < (j+l)l,  and 

- (dq^ajdl) 

where 

k « (J+1)I. 


From  Eq.  5.1,  we  have 


JX 

E 


k-I 


(5.2) 


(5.3) 


(5.4) 


Jl 

* (BO)1  £ 
k-I 


aj-1(k-I) 

zk 


- X + Y 


(5.5) 
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Adding  sero  on  the  RH8  of  Eq.  5,5  and  changing  indlcaa  In  the  tom  Y, 
we  have 


(j-l)  I 


X + Y » X + 


W £ 

' e-0 


hdS£L 

*• 


+ R-R 


where 


2 itafs2'  • 

k-0  * 


Grouping  one  R with  X,  ualng  Eq.  5.2  to  tranaforn  the  oecond  R,  recalling 
that  for  J-l,  0 S k S JI,  and  ualng  the  definition  and  convolutional 
property  of  the  a-transform  [6.3,  6.12],  va  have 


RHS(5.5)  - (X+R)  + y-R 


- (dqI)aj-1(a)B(2)  + Y 


k-0 


(5.6) 


Again,  ualng  the  definition  of  the  x-tranefora,  noting  that  in  Y the 
sum  la  from  0 to  (j-l)  I while  on  the  LHS(5.5)  the  aum  varies  from  I to 
JI,  and  adding  the  laat  term  of  Eq.  5.6  to  the  LH8  of  Eq.  5.5,  we  obtain 


aj(z)  - (LHS(5.5)  + R + Sj)  - Sj, 


where  a varies  from  jl+l  to  (j+l)I, 


- <X+R)  + <Y+S'  .)  - S' 

j-l  j-l 


Ot+R)  + (&J  aj_i(a)  - Sj_i  (5.7) 
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where 


s varying  from  (j-l)I  + 1 to  JI, 

From  Eqe.  5.3  and  5.4,  we  find  that 


SJ-1 


Thus,  the  Sj  term  cancels  out  in  Eq.  5.7  leaving  us  with  the  final  equation 


ij(z)  - (6qI)aj_i(z)B(z)  +^|a-^Iaj_1(z) 


Eq.  B4  can  now  be  solved  Iteratively,  if  desired,  thereby  proving 
Theorem  5. 

W*  can  also  be  explicitly  defined  in  the  same  way  as  W (except  that 
R*  can  vary  from  zero  to  infinity).  The  importance  of  Theorem  5 is  its 
use  in  Proposition  4. 

Proposition  3.  Let  sc  * 1 and  uls*  - 2.  Then,  we  have  for  the 
renewal  time,  T*,  for  state  2, 


E[T*] 


(l-ql)  U 1 + U*2 
(1-q1) 


Proof. 


E[T*]  - kP[T*-k] 


53  kQ*l*  Qi2(k) 


Evaluating  tha  last  axpraaalon,  va  hava  tha  rasult. 

Propoaition  4.  Avaraglng  W*  evar  ona  renewal  of  atata  uTi* , va  hava 

E[W*]  . likfill  (B6) 

m-q1) 

Proof.  Tha  ranaval  tlaa  la  givan  by  T*  in  Propoaition  3 and  haa  pdf 

Q$l*Ql2* 

Summing  aj(z),  in  Thaoram  5,  from  ona  to  infinity,  va  have  from 
Eq.  B4 

A(*)-i0(t)  - («qI)A(i)fi(a)  A(z)  (B7) 


vhara 

A(z) 


•» 

L 

J-o 


ij(«) 


From  the  mean  value  property  of  the  z-transform  and  the  definition 
of  aj (k) , va  thua  have 

E[W*]  • -zDzA(z)  (at  z«l) . 

The  proof  ia  finished  by  evaluating  tha  RHS  of  this  laat  equation 
and  simplifying. 

We  are  now  ready  to  prove  the  analogue  of  Theorem  3 (where  the  IFI 
functional  la  considered  to  be  a quantity  dependent  on  the  plan  but 
evaluated  on  the  model)  in 

Theorem  6.  For  SMC(2), 

lFI(-;2)  - vB(1-Si)P2(-}2),  [a.e.] 
where  2 ■ uTs*  and  P2(*;2)  ia  given  in  Theorem  4. 


Proof . Again  by  the  Strong  Renewal  (or  Ergodic)  Thao ram, 


Lla 

H+* 


W*(H) 

N 


. likfiiL  . . 

fi(l-ql)  (l-q^)p j + wj 


from  Eqa.  B5  and  B6, 

- Sd-S1) 


- - - 

(l-q1)^!  + M*2 


- 6<l-SI)P2<-}2), 

from  Eq.  B3. 

Multiplying  by  v flnlahaa  the  proof. 

In  particular,  the  equatlona  in  Theorems  3 and  6 agree,  aa  they 
should. 

Corollary. 

AIFI(«;2)  - 

Proof.  The  same  as  in  the  Corollary  to  Theorem  3. 


2.4  TFI(«;2)  and  Comparisons.  Given  the  real  number  p varying  over  the 
open  unit  Interval,  the  Inequality  "1-q1  < 1",  Theorem  1,  and  Theorem  4 
imply 

P2(-}1)  < p2(«;2) 

for  SMC(l)  and  SMC(2).  We  shall  show  a similar  result  for  AFI("»;1)  and 
ATF1(«;2),  Before  doing  this,  we  record  the  following  result 

ATFI(»|2)  - AFI(«{2)  + AIFI<«;2) 

■ (1-vP2(«j2))  + vS(l-6I)P2(“{2) 


33 


I 


v >4 
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, 


- 1-vPj  (•{  2)  (6  + gx+x) 

Theorem  _7.  For  p in  eh*  open  int*rv*l  0 < p < 1, 


(B8) 


ATFI(*;2)  2 AFI<«;1)  iff  Sd-S1)  2 qxo\ 
where  a\  ■ p,|/(wV  + wi). 

Proof . From  Eqs.  A2  and  B8,  th*  *t*t*a*nt  is  *quiv*l*ne  to 
P2(-il)  * Pa <-}2) <«  + Bx+1). 

This  lnsquslity  is,  in  turn,  equivalent  to 
(«  + BI+1)(p\  + wi)  i <l-q1)u*i  + ui 

■ <u\  + Mi)  - qVi 

Dividing  through  by  (u'i  + wi),  wa  have 
(S  + 8X+1)  i l-qIo\ 
or 

l-(fi  + 0X+1)  2 qxai. 

However, 

l-<«  + 61+1)  - 8<1-6X) . 

Thus , 

8(1-BX)  2 qM 

which  finishes  the  proof. 

For  p » 0 or  1,  the  formulas  in  Theorem  7 sr*  equal. 

Another  type  of  CSP,  denoted  here  by  CSP-14,  la  the  plan  obtained  from 
replacement  of  DS1  in  CSP-11  by  USZ.  For  CSP-14,  th*  SMC  model  is  straight- 
forward since  the  limited  Inspection  scheme  runs  with  the  natural  flow  of 
operational  time.  For  this  model,  we  have 

Proposition  3.  Letting  sc  ■ 1,  uls  ■ 2,  and  ck  (or  USI)  • 3, 
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a-ei  r iif'  if  rifelit'ifi' 


P20s4) 


vj 


"wT 


(l-q*)pl  + 


+1 


Proof . If  9 Is  the  stationary  vector , using  the  SMC  model  for  the 
ck  phase  found  in  [6.2],  we  have 

<l-qJ)e  - (1-q1,  1.1). 

The  rest  of  the  proof  easily  follows  from  A. 25  given  that  113  ■ I, 

It  clearly  follows  from  Proposition  5 that 
API (*»;4)  - 1-vP2(-;4) 

Thus,  to  compare  AFI(*}4)  and  ATF1 (°»{2) , It  would  suffice  to  compare  the 
expressions  which  are  analogous  to  those  in  Theorem  7.  However,  to  avoid 
a long  proof.  It  also  suffices  to  give  the  following  probabilistic  argument. 

Upon  finding  a defect  In  the  sampling  phase,  1 new  units  are  inspected 
with  CSP-14  while,  on  the  other  hand,  at  most  vl  new  units  are  inspected 
under  CSP-12.  Since  the  transitional  probabilities  are  the  same  from  the 
limited  inspection  (pseudo)  phase  in  both  plans,  the  proof  is  finished. 


3.0  D3I  - TRANSIENT.  Two  interpretations  of  DSI  for  the  transient  case 
ere  treated  in  this  chapter.  The  first  version  is  the  transient  case  of 
DSI,  already  dealt  with  in  Chapter  2 for  infinite  N.  That  is,  DSI  is 
applied  to  both  phases  of  CSP-11  with  constant  "pseudophase"  transitional 
probabilities.  In  contrast  to  the  first  version,  the  second  plans  "pseudo- 
phase"  transitional  probabilities  to  sc  (or  ula)  monotonically  increase 
(or  decrease)  with  increasing  duration  in  the  sampling  phase,  until 
truncated  by  1-q1  (or  q1).  One  can  infer  from  this  monotonicity  that  DSI 
is  applied  only  to  the  sampling  phase  in  the  following  sense.  If  a defect 
is  found  during  a sampling  segment,  k + 1 time  units  from  entrance  to  this 
particular  segment,  then  only  the  previous  x units  are  to  be  inspected, 
where  t ■ min(k,  I).  Upon  completion  of  this  modified  DSI,  uls  is  entered 
if  no  defects  are  found  (with  probability  qk) ; otherwise,  sc  le  entered 
(with  probability  1-q^) . 

3.1  Introduction.  The  analysis  of  each  version  involves  three  stages. 
However,  for  convenience  in  the  final  section,  a fourth  stage  ia  added  for 
the  second  version. 
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In  Che  primary  stage,  Che  modified  aaapling  phase  la  partitioned 
into  I + 2 SMC  states  which  are  consecutively  labelled  0 through  I 
and  b.  The  purpose  of  this  splitting  is  the  derivation  of  an  expression 
for  the  monotonically  increasing  portion  of  the  functional  W(*). 

In  the  secondary  and  tertiary  stages,  SMC  states  1 through  I are 
recombined  into  a preliminary  macrosteta,  c’ ; it,  in  turn,  is  combined  with 
SMC  state  0 to  form  the  final  SMC  state,  c.  The  purpose  of  these  latter 
two  manipulations  la  to  facilitate  the  derivation  of  an  expression  for 
the  truncated  portion  of  W(»)  by  avoiding  complex  sums  of  products  of 
characteristic  functions. 

The  chapter  concludes  with  a comparison  between  the  TFI  functional 
of  each  version  for  infinite  N (or  t). 

3.2  Strict  DSI.  In  order  to  analyse  the  transient  case  of  DSI,  the  SMC 
model,  shown  in  Figure  4,  is  used.  It  is  denoted  by  SMC(3). 

Figure  4 


Model  for  CSP-12  (SMC(3>) 


Jl-q1) 


sc  ■ a 


5 (1— q1) , r ■ l-dq1,  6'"  4/r,  and  8'  ■ 8/r 


Concerning  this  model,  we  have 
Theorem  8.  SMC (3)  is  an  irreducible  SMC. 
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Proof . The  s-transformed  pdf' a of  tha  atatea  making  up  SMC (3), 
cogathar  with  thair  corraapondlng  transitional  probabllltiaa  In  tha 
ambaddad  MC,  ara  glvan  balow. 

§k,fcfrl(a)  " */•»  ^k,k+l  ■ 81  for  1 S k s 1-1 
Qka^1)  " 3"/*  > qka  ■ ^ * for  1 i k i,  1 
6k0(‘>  * 4q*/s,  4k0  " for  1 i k i I 
Q0a<*)  * fi/ts-fiq1),  q0a  • S' 

Q0l(t)  - ^/(s-Sq1),  q01  - 6* 

Qa0<«)  " q1C*-q>/«K*> » qa0  * 1 

Qba(*)  ■ T/C-i).  qbt  ■ i-q1 

Qb0(*>  “ SqI/(a-S),  qb0  - q1 

Tha  aquations  follow  from  SMC (2)  by  observing  that  uTs*,  since  its  holding 
time  pdf  is  geometric,  canjse  regarded  as  a MC  state  which  Jumps  to  itself 
and  sc  with  probabilities  ? and  ?,  respectively. 

Ordering  the  states  of  SMC(3)  in  the  same  manner,  from  left  to  right, 
as  they  ara  ordered  in  Figure  4,  we  obtain  the  linear  system  of  equations 
from  the  matrix  equation  a ■ eT,  T the  embedded  MC  transitional  matrix. 


1 

6'a0  + 5 Se1  + (l-qI)eb  ■ ea  (8.1) 

J-l 


*a  + 4q*  2 eJ  + ql#b  " *0  ^8'2^ 

J 

8'eo  “ *1 
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bti&ifttatti 


win 


for  1 

M 

VI 

< 1-1. 

8ek 

* *k+l 

end 

8ei 

• «b 

From 

this 

system,  exclusive  of 

Eqs.  8.1  and  8.2, 

we  obtain 

•k  " 

&k-l8’e0,  1 S k S I 

(8.3) 

and 

•b  ■ 

8I8,s0 

(8.4) 

Eqs. 

8.1, 

8.3,  and  8.4  Imply 

«'  *0 

+ B’u-q^U-e1)^  + 

(l-q^so  - s. 

or 

- 

±sL 

l-dq- 

l «0  * *a 

(8.3) 

Sine*  e is  normalised,  we  have  from  the  sum  of  ita  components,  Bqs.  8.3, 
8.4,  and  8.3 

(8.6) 


where 

G - (l+SXl-dq1)-  81+2. 


Thus,  Eqs.  8.3,  8.4,  8.S,  and  8.6  Imply 


•b 


- 1 , and  ek  - 


where  1 i k i 1. 

Differentiating  the  §'a,  multiplying  through  by  minua  one , end 
evaluating  the  reaulta  at  t ■ 1,  ua  have  (adding  terms  where  appropriate) 


“o 


1 

l_gql 


Wb  ■ £ * and  Mfc  * 1 

for  1 g k i I. 

This  finishes  Theorem  8. 

Corollary.  For  SMC (3), 

I 

<»0  + 2 *1  + “b  " *g<**2>  (8.7) 

1-1 

where  the  LHS  ref ere  to  SMC (3)  and  the  RHS  refers  to  SMC (2). 
Proof . 

1 

M0*0  + ]IC  uk*k  + Mb*b  " 

k-1 

- 1/0 

«...  + i/o  - (^(g)  + ■$ 

‘ erfcn 


Thus 
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f 


LHS  (8.7)  - GPa(»;2)(l/G) 

- M-.2) 

Relative  to  SMC (3),  wa  have 

Definition  5.  The  monotonically  Increasing  portion  of  W(t) , divided 
by  t,  and  considered  as  being  defined  on  SMC(3)  Is 

t-1  I 

^ RC^(n)  (l-C^Cn+D) 

W'  (t)  . n-0  k«l 

t t 


Thus  we  can  also  write 

IK*  (t;2)  - v . 

Operating  on  this  equation  and  the  RHS  of  the  equation  in  Definition  5 
with  Ege[‘],  we  obtain 


AIFI1 (t;2) 


t-1  1 


»ak<n) 


(Cl) 


which  can  be  evaluated  by  using  the  t-transformed  Backward  Equations  for 
SMC(3);  see  [6.1]  for  an  example  of  such  an  evaluation.  Letting  t approach 
Infinity)  we  have 


I 

AIFI’  («;2)  - 6 2 kok  <C2) 

k-1 


Since,  from  the  last  part  of  Theorem  8 and  from  A. 25 
ak  “ (^g“)  CGP2 C-;2>) 


- 68kP2(»;2) 
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and 


®b  " (^o — ) <Wb><0*2<"l2)) 


- 81+1P2(*»}2) 
wi  have,  from  Iq.  C2, 


6 S tok  " M*V*2)  db  (rit^) 


S(l-81-«I$1)P2(-,2) 


Prom  A. 27, 

W"  (b)ob  - 6I«b 


where  W"  (t)  is  tha  constant  port  of  W(t) . Tharafora,  adding  the  last 
two  axprasalons  and  performing  tha  indicated  operations,  wa  hava 


r 


ll»  SIHiSll  . 8(i-|I)P2(«t2), 

a result  which  agrees  with  that  obtained  in  Chapter  2. 

In  order  to  deal  with  tha  constant  part  of  the  functional  for 
finite  t,  we  proceed  to  reduce  SMC (3)  to  a more  manageable  model  as 
described  in  Section  3.1. 

8tage  two  consists  in  filtering  out  the  states  1 through  I in  SMC (3), 
an  operation  which  leads  to  a new  models  SMC (4).  The  details  and  results 
of  collapsing  SMC(3)  into  SMC(4)  are  given  in 

Theorem  9.  SMC (4)  is  an  Irreducible  SMC  obtained  from  SMC(3). 

Proof.  Let  c*  be  the  ordered  ensemble  composed  of  tha  states  1 
through  1.  Noting  that  the  pev  of  c'  is  (1,0,0, — ,0),  1-1  saroa,  we 
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apply  combinatorial  analysis  to  gat  (dropping  ths  argument  s) 


Vo  “ Oio  + Ql2Q20+  — + Q12Q23* *Qio 


*-B  ' la  ) ; 


In  ths  saaa  way,  wa  also  obtain 


«...  • <i-(f)l> 


c a 


(9.1) 


(9.2) 


(9.3) 


The  remaining  results  concerning  SMC (4)  can  be  easily  derived 
from  the  above  equations.  In  particular,  sea  A. 29. 

Corollary. 

SMC (4)  < SMC (3) 

where  "<"  is  the  filtration  ordering  relation. 

Proof . SMC(4)  is  a filtration  of  SMC(3)  by  the  proof  of  Theorem  9 
and  A. 29. 

Stage  three  consists  in  filtering  out  state  c*  in  SMC(4)  yielding 
SMC(5).  The  details  and  results  are  given  in 

Theorem  10.  Filtering  out  c'  in  SMC (4)  yields  a new  SMC,  denoted  by 


m\ 


Proof.  Lot  tha  ordarad  ansaablo  (0,c* ) bo  donotod  by  c.  Than, 
tho  pov  for  c la  tha  vaetor  (1.0). 

First  construction.  Applying  combinatorial  analysia  to  tha  trans- 
form*!' pdf's  in  Thaoram  9 (Eqa.  9.1,  9.2,  and  9.3),  wa  hava 


5c.  • {e  <5oc'5C'o)j}  5, 

'j-0  ' 


jZ  <5.c'5i 

lj-0 


Qe' 0^  f ^Oc'  ^c'a 


Qoa  + Qoc*  Qc1  a 

i-Ooc'^c*  o ~~ 


whara 


c(a)  - aI+l(s-(B  + Sq1))  + dB(qB)1  . 


Similarly, 


Qcb  “ | <Qqc'  PcP  0>J  | ^oc'^c'  b 


^ Qoc'  ^c'b 
" i-^oc*  9c’  0 
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c(e) 


1+1 


(C4) 


Second  construction.  Since  SMC (5)  ii  the  model  to  be  used  in 
deriving  an  expression  for  the  constant  part  of  IFI(t}2),  t finite,  we 
will  sketch  the  more  elaborate  SMC  method.  The  relevant  absorbing  SMC 
has  transient  states  0 and  c’ ; absorbing  states  a and  b.  Using  A. 21, 
setting  a - A,  b ■ B,  and  c'  ■ 1,  we  obtain  the  following  transformed 
Backward  Equations  (four  others,  not  needed,  are  omitted). 

*0A  " §0lPlA  + §0A^AA 

^1A  " QlO^OA  + QlA^AA 


^OB  * $01^1B 

*11  ’ *1O*0B  + $1B*BB 

fBB  " 

A 

Solving  for  in  the  first  set  of  three. 


t 


OA 


~ ^oAo 


Since  the  pev  of  the  ordered  ensemble  (0,1)  is  (1,0),  the  above  equation, 
Eq.  Al,  A. 13,  and  A. 22  innly 


0 
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Eq , C3 


Solving  for  PqB  In  Che  second  set  of  three, 


p m Ho <QqiQib> 
“ i-M.o 


Again,  since  Che  pev  ■ (1,0),  the  above  equation  together  with  Eq.  Al, 
A.  13  and  A. 22  imply 

$cb  " *0B 


■ Eq.  C4. 

SMC (5)  has  three  states:  a,  c,  and  b.  The  transformed  pdf's  for 

transitions  of  a to  c,  b to  c,  and  b to  a are  the  same  as  those  for  a to  0, 
b to  0,  and  b to  a,  respectively,  in  SMC(4). 


We  finish  the  proof  of  Theorem  10  by  remarking  that  states  a and  c 
cannot  be  combined  since  a pev  (from  state  b)  does  not  exist. 

Corollary. 

SMC (5)  < SMC (4). 

Proof.  Construction  of  the  state  c in  SMC (5)  is  equivalent  to 
filtering  out  state  c'  in  SMC(4).  SMC(5)  is  an  irreducible  SMC  by  A. 29. 

We  can  now  derive  an  expression  for  the  constant  part  of  IFI(t;2)  In 

Theorem  11.  Given  the  3 state  model,  SMC (5), 


1F1"  (t ; 2)  - vl  |£kl£h£kiEi 


Proof.  Nb(t)  gives  the  number  of  entrances  to  state  b by  time  t. 
The  number  of  exits  from  state  b is  clearly  Nb(t)“Cb(t) , the  second  term 
being  the  characteristic  function  of  state  b. 


9 


Proof.  Apply  Ea[ • ] to  Eq.  C5. 

In  order  to  use  Eq.  C6,  we  must  be  eble  to  develop  a ueeeble 
expresalon  for  the  seen  of  the  renewal  function.  Towarde  thle  end 
we  prove 

Proposition  6.  Let  N(t)  be  a renewal  proceea.  Then 

. • 

E[H(t) ] - H0*F*  I ^ 

J-0 

where  F la  the  renewal  pdf. 

Proof . 

P[H(t)  - n]  - P[U<nfl)  > t]  - P[U(n)  > t] 

- HoeF^®4"1)^)  - H0*F<n>(t) 

• Fn(t) 

Thus . 

#n<*>  “ fio(^)n(l-P) 

Therefore. 


vhara  the  invars*  axpraaalon  la  shorthand  for  ths  summation. 

Proof.  Renewals  of  stats  b,  starting  in  stats  a,  form  a dslayad 
rsnswal  procsss  with  initial  probability  function  Fab«  Than  Proposition  6 
flnlshss  the  proof. 

Corollary  2. 

Lim  Ea[Nb(t) ] 


Proof.  From  Corollary  1 above,  we  have 

Lim  Bat«b(Q1  „ Lim  Hp»8(t) 

t-M>  t t-*»  t 


„ Lim  S(t) 
t+»  t 


where  8(t)  - Ho*F4^a(l-r^j5)"1(t) , 


Lin 

s+1 


M 


a-^bb) 


W«> 

-«DgPbb(«) 


(at  t • 1) 


■ flab* 


Tha  aacond  aquality  follows  from  tha  simple  argumant  that  If  8(*) 
la  a aaquanca  with  limit  A,  then  the  Cealro  limit  of  S(*)  also  exists 
and  la  equal  to  A. 

From  the  aacond  corollary  to  Proposition  6,  we  have  in  addition 


MSfeM.Jskfilj  _ „I4ab 

as  t approaches  infinity,  since  the  aacond  term  goes  to  saro. 

The  main  results  about  IFI(t;2)  are  summed  up  in 
Theorem  12.  For  tha  transient  case  of  CSP-12,  we  have 
AIFI(t ;2)  - AIFl'(t;2)  + AIF1"  <t{2) 

+ vlj 

where  the  first  and  second  terms  on  the  RHS  are  evaluated  using  SMC(h), 
h « 3 and  5,  respectively. 
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Proof . Combine  Eqa.  C2  and  C5  (taking  tha  Unit,  wo  gat  v tlmaa 
tha  raault  ualng  V and  W"  ). 

Whan  t la  finite,  in  ordar  to  compute  Ea[Nb(t)],  we  naad  to  know 
Fab(t)  and  ?bb(t) • 8 Inca  8MC(5)  haa  3 atatee,  va  hava  9 Backward 

Equation*,  only  ona  of  which  la  naadad  for  tha  maan  valua  of  tha  above 
ranawal  function.  Tha  following  atataaanta  akatch  tha  raaulta. 

Prom  Theorem  10,  A2.1,  and  A1.4,  wa  hava 

*bb  " Qbc^cb  + $ba$ab  + 3b 
Thia  aquation  la  aqulvalant  to 


* ■*•(!&)  + »t- 


1 - Oba*ab  + Qbc*cb 


But,  LHS  ■ Fbt,.  Tharafora, 

^bb  ■ Gba^ab  + Qbc^cb* 

From  Theoram  10, 

Qbc  " ,nd  ^b*  " ’ 


Applying  combinatorial  analyala  to  tha  tranaformad  pdf 'a  of  SMC(5), 
we  have 


I (Qca^ac^  } Qcb 


■ $cb/U-$»c3c«> 


^•b  " 1 1 (^*c^ca^  | Qac^cb 


Qac^cb/ ^"^acQ  c*)  « 


From  theee  aquationa,  B[N^><t) ] /t  can  ba  computed  [cf.,  6.1]. 

Tha  uaa  of  SMC  (3)  auggaata  tha  following  altamative  traatmant  of 
CSF-12,  Inataad  of  aplittlng  uli*  Into  I + 2 atataa,  va  apllt  it  Into 
an  infinite  number  by  aplittlng  atata  b into  tha  atataa  b(J),  1 < j a ** , 

Tha  raaultlng  modal,  SMC (6),  conalata  of  two  nontrivial  SMC  atataa 
(a  and  0)  and  an  infinite  number  of  trivial  SMC (la,  MC)  atataa  (1  through  I 
and  tha  b(j)'a).  For  tha  long  run  caaa,  wa  can  obtain  AIFI(-;2)  via  tha 
tranalant  caaa  aa  ahown  in 

Propoaitlon  7.  SMC (6)  la  an  infinite  atata,  Irreducible,  and 
poeltlva  recurrent  SMC.  Tha  raault  for  XFl(t|2)  for  SMC (6)  la  tha  aame 
aa  pravloua  raaulta. 


Proof.  For  b(J),  li]<",  wa  have 
°b(J)  " 


(7.1) 


»*b(j)  " 1* 

Thua  t*b(j)b(J)  " l/ab(J)  which  la  finite,  proving  tha  chain  poaltiva 
racurrant. 


For  tha  functional,  it  aufflcaa  to  deal  with  tha  part  defined  on 
tha  b(j) 'a,  W’". 

tgl  «• 

lll/v  £ £ Cb(j)(n)(l-Cb(j+i)(n+l)> 

(t)  . n-0.,1^ 

t t 


Taking  the  naan  value , conditioned  by  an  initial  entrance  from 

•tate  a, 


8a(W"'  <t)] 
t 


| J jpab(j)(n> 

t 


which,  aa  t approachae  infinity,  approaches 


m 

«2gI+1P2(.}2)  2 9J 
j-0 


3ob 


by  Bq.  7.1  and  the  Labeeque  Dominated  Convergence  Theorem  (for 
•aquancea). 

Proposition  8.  The  models  used  for  CSP-12  are  ordered,  w.o. 
filtration,  as  follows. 

SMC (2)  < SMC (5)  < SMC (4)  < SMC(3) 


and 


SMC  (5)  < SMC (6) 

Proof.  Corollaries  to  Theorems  9 end  10  imply  the  first  ordering. 

By  filtering  out  states  b(j),  J i 2,  we  get  the  second  ordering. 

If  we  split  state  a into  its  component  MC  states  and  stats  0 into 
a MC  state  in  SMC(6),  we  get  (S)MC(7)  > SMC  (3),  SMC(3).  If  we  instead 
split  a and  0 as  before  but  now  split  b by  treating  it  as  a MC  state,  we 
get  (S)MC(8)  > SMC (3) , SMC(3),  Clearly,  (S)MC(7)>(F)MC(8) . MC(8)  can  be 
thought  of  as  a finite  state  MC  model  which  fills  the  role  of  the  initial 
MC  model  described  in  the  Introduction  to  Chapter  1,  though  the  construc- 
tion is  backwards  from  that  description. 
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3,3  Liberal  DSI.  To  obtain  a mora  libaral  DSI,  we  altar  tha 
following  tranafomed  pdf' a for  atataa  0 through  1-1  in 


Thaoren  13.  Tha  DSI  sampling  plan  CSP-13  la  obtalnad  from  tha 
SMC (3)  modal  of  C8P-12.  Tha  result,  SMC (3),  la  an  lrraduelbla  SMC. 

Proof.  Tha  appropriata  quantities  and  propartiaa  ara  glvan 
balow. 

Q0.  " 0.  <loa  “ 0 
$01  - B/(a-6),  q01  - 1 
Qka  ■ fiV* . q,.-  5 

qk0  ■ «qk/a,  qk0  - «qk 

whara  1 & k < 1-1 

Tha  othar  tranaformad  pdf's  remain  tha  aama  aa  thoaa  for  SMC (3). 

Ordarlng  tha  atataa  a,  0,  1,  I,  and  b,  wa  obtain,  from  tha 

stationary  vactor  aquation,  tha  system  of  aquations  now  givan. 

6 t (W 

j-1 

•a  + 5 2 qJ*J 

J 

gk-1. 

B a0 

whara  Ukil,  and 


3)ej  + (l-ql)ab  - (« 


(13.1) 


+ qIeb  - a. 


(13.2) 


«k 


(13.3) 


eb 


(13.4) 
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Proa  Iqs.  13.2,  13.3,  and  13.4,  wt  gat 


• l-TSq) 


(13,5) 


Since  eha  components  of  a are  normalltad,  va  obtain,  together  with 
Eq.  13.3, 


•o  G 


(13.6) 


where 


G - «p(l-(6q)1)  + (l-$q)(l+fi-8I+1) 
Eqa.  13.4,  13.5,  and  13.6  imply 


, 1 S k < I 


#K  . 

b G 


Similarly,  from  the  derivatives  of  the  transformed  pdf's,  we  obtain 


ia  “ “9-  . V0  " I • wb  * 7*  *nd  ^k  " 1 


where  1 i k i I. 

3.4  Comparieon  of  CSP-12  and  CSP~13.  In  the  equations  to  be  derived 
in  this  section,  P2(»{3)  ie  the  long  run  percentage  of  time  spent  in 
state  b**2  in  the  three  stage  reduction  of  SMC(3)  to  SMC(2)  which  is  the 
analogue  of  SMC(2)  for  CSP-13.  P2(»{3)  can  also  be  directly  obtained 
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from  SMC (3)  by  filtering  out  the  ititu  0 throu|h  1 and  b,  again 
yielding  SMC (2) , This  latter  filtration  ia  aqulvalant  to  the  SMC 
method  applied  to  the  ordered  eneemble  0,  1,  •••,  1,  b),  with 
pev  • (1,  0,  •••,  0),  1+1  aeroe,  to  obtain  the  two  etate  model 
for  CSP-13. 

Given  the  atatlonary  vector  components  and  the  etate  mean  time 
valuea , from  Theorem  13,  we  get  the  o’e  for  CSP**13. 


ov  - agVt-sS),  1 S k 1 1 


(13.6) 


peti-ceo1.  + ^ 


(l«Sq) 


(1  ■ a and  2 * b). 

Applying  the  Brgodic  Theorem  and  Eqa.  13.6  and  13.7  to  the 
functional  W(t),  defined  as  SMC(2),  yields 

Urn  IlMl  . S j ; kak  + SI«„ 


- 652P4(-i3)DB  (p|— ') 


"‘C’ 


Upon  taking  the  limit,  tha  definition  of  XFI(t|3),  analogous  to 
Definition  4,  glvaa 

AI«(-}3)  - v$(1^I)P2(-(3>. 

Adding  AFI ("{3)  to  tha  above  leads  to  the  final  aquation 
ATFI(*»j3)  - l-vP2(-|3)(«+eI+l)  (C8) 

With  regard  to  tha  last  equation,  we  have 
Theorem  14,  For  p in  the  open  unit  interval, 

ATPI (*»}3)  < ATF1  (»;2) . 

Proof,  Tha  statement  is  equivalent  to 
P2(-;3)  > P2(-}2) 
which  is  implied  by 


< 1~*1, 


Dividing  both  sides  by  p and  using  the  theorem  on  geometric  sums,  the 
above  inequality  is  equivalent  to 


1-1  1-1 
(i+  ]£  <e<0J)  < (i+  £ qJ ) 

j-i  ' v j-i 


OT 

811+Sj]  < [1+S2] 

But  0 < 1 and  0s } < s2,  for  p between  tero  and  ona.  The  cases  for  p * 0 
and  p * 1 lead  trivially  to  the  same  formulas, 

To  handle  the  transient  case  of  CSP-13*  SMC(J)  is  used  for  the  in- 
creasing part  of  W.  The  constant  part  of  W is  handled  in  the  same  way 
as  the  corresponding  constant  part  of  W is  handled  for  CSP-12,  That  is, 


SMC (3)  is  collapsed  Cor  filtered)  to  8KCC4)  which  in  turn  Is  collapsed 
to  SMC(T).  This  analogous  two  stage  process  for  CSF<*13  Is  briefly 
given  in 

Theorem  15.  For  CSP-13,  filtration  gives  the  following  ordered 
set  of  models  t' 

SMC  (5)  < SMC  (7)  < SMC  (3). 

Proof.  Combining  states  1 through  1,  in  SMC (7),  Into  state  c1 
as  Is  dona  with  SMC (3),  ws  have 

Qc'b  " QiaQsi*— “’Gib 

- (S/s)1 

Similarly , 

Qc’o  “ Ql0+Gl2Q20+* • ,+Ql2Q23* — *Qio 


Qoc*  ■ S/(s-S) 


-irr  (1_  (i)1)"  5c'° 


Secondly i combining  states  0 and  c'  Into  the  new  state  c is 
similarly  accomplished  and  yields 

A Gpc'Gc'b 

Qcb  ' l-0oc'dc’O 


| 

i 
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f ! 


I 


^Oc^c^a 

^Oc'^c’O 


The  corresponding  q's  era  given  by 


,eb.IkiaiiI 

See  a 


where 

A • p + Sq(Sq)1. 


i 


Once  egeln,  the  constant  pert  of  the  functional  W(t)  le  given  by 
tffbCt)  Peb(t)  ) 


4.0  DSI  AND  OTHER  FUNCTIONALS. 


4.1  Introduction.  The  TFI  functional  makes  a distinction  between  the 
two  plana  treated  in  Chapter  3 in  terms  of  the  "pseudophase"  transi- 
tional probabilities.  However,  because  of  its  vary  definition,  TFI  does 
not  explicitly  take  account  of  multiple  inspections  of  a given  production 
unit.  That  is,  TFI  is  defined  in  terms  of  an  operational  tima  which  is 
measured  by  a flow  of  successive  and  nonrepeating  production  units.  In 
this  chapter,  a new  functional,  along  with  a variation,  is  introduced  to 
augment  TFI  as  a measure  of  plan  performance.  The  functional  is  Fraction 
of  Repetitions  (FR) . It  will  be  analysed  only  for  the  first  type  of  plan 
. Furthermore,  FR  is  chosen  as  the  principal  functional  because 
1.)  it  ip  naturally  normalised  and  2.)  its  long  run  momenta  can  be 
naturally  derived  from  those  of  the  transient  case  with  a certain  amount 
of  ease.  Short  run  higher  momenta  for  its  variant  cannot  be  obtained  so 
readily;  indeed,  appeal  must  be  made  to  the  Strong  Ergodic  Theorem  (or 
Renewal  Theorem)  for  even  the  long  run  (expected)  value. 
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4.2  SMC (9)  and  FR(N;2).  The  model  which  will  be  used,  8MC(9),  la  a 
modification  of  SMC (2)  and  is  portrayed  in  figure  5, 

Figure  5 

CSP-12  and  SMC (9) 


The  transitional  matrix  of  the  embedded  MC  la 


l-ql  0 


where  6 * d(l-q^)  and  r ■ l-6q*. 

The  matrix  entries  are  obtained  from  the  transformed  pdf's  given  in 
Theorem  16.  SMC(9)  is  an  irreducible  SMC 
Proof.  The  transformed  pdf's  are 
Qao  * qI<*-q)/«(2) 

Qoa  “ S’/**  Qoo'  * «q:. *>  and  Qob  " 0/* 

Oo'a*  ^/(z-Sq1)  ana  Q0'b  * B/(z-6qr) 

Qj,g  - ?/<a-6)  and  Qbo'  - 5qI/(z-B). 


The  mean  holding  times,  obtained  from  the  derivatives  of  the 
transformed  pdf's,  are 

Wa  " ~3“  , Wo  ■ 1.  wo'  " . and  Wb  - '£  • 

Using  the  matrix  given  after  Figure  5 to  solve  the  usuel  eigen  value 
equation,  for  the  stationary  vector  e,  yields  the  system  of  equations 
given  below. 

6e0  + — eo»  + (l-q1)eb  - ea 

•a  " eo 

5qIeo  + qJeb  * eo« 

6e0  + f e0'  " eb 

(where  r ■ l-dq1). 

Solving  the  system  gives 

®a  " ®0 


Again  we  use  the  fact  that  the  components  of  the  stationary  vector  add 
to  one.  Using  the  equation  which  expresses  this  fact,  together  with  the 
last  three,  gives 

ea  " e0 

- (l-qx)/G 

e0.  - qI(l-6qI)/G 
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and 


aj,  ■ 8/G 


where 

G - (1-q1)  + (l-fiq1)  + 0. 

We  flniah  Che  proof  by  translating,  into  Englieh  text,  what  the 
traneltlona  naan  In  SMC (9);  we  will  write  "etate  x goaa  to  etate  y"  as 
"x  to  y".  0 to  b if  no  defect,  0 to  O'  if  a defect  is  found  but  DS1 

finds  none,  0 to  a if  a defect  is  found  and  DSI  finds  one  or  more,  O'  to 
a if  a defect  is  found  and  DSI  finds  one  or  more,  O'  to  b if  unit  is  either 
not  inspected  or  Is,  and  found  non-defective,  and  O'  to  O'  (remaining  in  O') 
if  a defect  is  found  but  DSI  finds  no  defects.  The  transition  O'  to  O'  is 
"Internal"  - that  is.  O'  has  no  self  transitions  and  is  consequently  a non- 
trivial SMC  state  (see  its  pdf  above  and  Chapter  1,  section  5). 

We  are  now  ready  to  define  the  principal  functional  in 

Definition  6.  Given  the  model  SMC (9)  for  CSP-12,  the  functional 
Fraction  of  Repetitions  is 

, , £ C0i(k) 

FR(t)  . 


The  definition  of  FR(t)  is  motivated  by  the  comments  made  at  the  end  of 
the  proof  to  Theorem  16.  In  addition,  we  remark  that  minus  one  appears 
since  the  inspection  process  begins  in  state  a and  the  summation  appears 
for  O'  since  self  transitions  are  not  allowed.  For  infinite  t,  FR  has  the 
value  given  in 

Theorem  17. 


Lim 

t-H» 


FR(t) 


(l-qljy'j 


»’2 


(a.e.J 


where  w'i  and  vfe  are  defined  in  Theorem  1. 
Proof . From  Definition  6, 


\ 


\ 


463 


S2  "<«>  • E (“r1)  * E 


+ o»0 1 , [a.*.] 


by  Cha  Strong  Ergodlc  Thaoraa.  From  Thaoram  16  and  A.  25,  wa  hava 


a*  (l-q*j) 

^a  u\  (l**q^)+(l"q^)+q*+  $/4 


y\(l-q1)+w^ 


y,l(l-qI)+  y*2 


Adding  tha  two  axprasalona  flnlahaa  tha  proof. 

Since  (I) * (tFR(t) ) can  be  regarded  as  tha  degree  of  Inspection 
overlap,  we  are  led  to  define  a variant  of  FR(t)  in 

Definition  7. 


FR’(t) 


I(tFR(e))  + t * 


Concerning  this  functional,  we  have 
Theorem  18. 

Lim  FR'(t)  - 2 L“ 2 

t-H»  I + (l-ql)^  +•  u‘2 


, [a.e. ] 
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Proof. 


From  Definition  7, 


Llm  1 

t-H.  ft)  (PR(t))+l 


■ the  result. 

K 

i 

4.3  Expansions  and  Ext *n» Iona.  Another  possible  treetment  of  DS1  Is 

the  expansion  of  MRP  (snd  SMC)  models  to  "transition  stste"  models.  We 

will  work  here  only  with  MRP's.  •! 

i 

Given  e MRP  (Y.  U)  es  in  A. 19.  we  een  easily  prove  that 

- t|Yn_i  - 1 end  Yn  - j]  - (D1> 


where  Tn  • Un  - Un«i.  Prom  A. 19  aud  Eq.  Dl,  we  can  also  easily  show  that 
)((Ya,  Yn+i> , Un)/n  varies  over  the  net'l  nos.f  (D2) 


Llm  PR’(t) 
t+* 


by  Theorem  17, 


is  a (derived)  MRP  whose  pdf's  are  given  by 

p[<*n»  Yn+1>  * Cl*),  Tn  - t | (Yn_!,  Yn)  - (i,j)] 


“ qlk6jl  <D3) 

We  name  the  MRP  given  by  expression  D2  and  simplify  notation  in 

Definition  8.  The  MRP  given  by  D2  Is  called  the  Expanded  MRP.  Its 
pdf's  are  given  by  Eq.  D3  and  denoted  by 
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& 


l 


Q(U)(Ak)<t>* 


Such  a derived  process  can  automatically  keep  track  of  transitions, 
thalr  number  and  type,  in  tha  parent  process.  Thus,  for  example, 
PR(«|2)  could  be  defined  (and  evaluated)  on  "expanded"  NIP (2)  as  given 
belov. 

Theorem  19.  Expanded  MRP (2)  is  a MRP. 

Proof.  Proa  Definition  8 and  Theorem  2,  the  transformed  pdf's 
are  (dropping  the  argument) 

Q(l2)<22)  * flX0l2 


Q<ia>(2i) 
4(22) (22) 

$(22) (21) 


(l-q1)Qia 

$22 

d-qI)^22 
' q1 


Q(21)(12) 


(l-ql)  * 


Letting  1 ■ 1 in  the  above  equations,  we  get  the  transitonal  matrix 
of  the  embedded  MC 


(12) 

(22) 

(21) 

(12) 

0 

q1 

1-q1 

(22) 

0 

q1 

l-ql 

(21) 

1 

0 

0 

Using  the  matrix  to  solve  for  the  components  of  the  stationary 
vector  gives 


! 

1 


46b 


ir 


*■ 


*•  7 


•(12)  • (l-q1)/©,  *(22)  • q1/©,  and  #(2l)  • a(12) 


vhere 


C ■ 2-q1. 


Daflning  pjj  aa  cha  naan  holding  tine  till  tranaitlon  ta  4 

°,t"‘  - jyss 3 


‘'(lj)  * <*»ij  I qjk)/qij 


. " wij/qij 

Applying  Eq.  19.1  to  tha  tranaforned  pdf* a ylalda 


(19.1) 


W(12)  - Wl/q12 

“(22)  * U22/922 

“(21) 

■ P'l/l 

- U22/qX 

- u\ 

■ vfc 

Wai/d-q1) 


dirlv.d^fro.)  Z,™  «.  d.fin.d 

Dafinltion  9.  For  Expanded  (MRP (2)), 


FR(t;2)  - (!)  lW(22)(t)^N(2n(t)) 


ZhfQgya  20.  For  FR(t}2)  In  Definition  9, 
1 


Llm  FR ( 1 1 2 ) • 

(l-q^w'l  + p'a 


[a.e.  ] 
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1 

E[T] 


where  E[T]  ie  given  In  Fropoeition  1. 

Proof.  Theorem  19  end  Definition  9. 

Ve  eloee  thie  chapter  by  ahowing  that  SMC (9)  cannot  be  collapsed 
into  any  of  the  other  models  for  CSP-12.  Any  collapsing  would  require 
that  the  ordered  ensemble  S * (0,0')  be  a macrostata  as  defined  in 
Chapter  1.  However,  entrance  from  state  a or  b would  require  the  pev 
to  be  (1,0)  or  (0.1),  respectively.  If^we  picked  the  former  pev  and 
formally  defined  QbS  to  be  the  same  as  Qbo'«  the  Backward  Equation  system, 
for  SMC(9)',  say,  would  not  hold.  For  example,  if  S were  a macrostate, 
then,  letting  S ■ d,  the  equations 

p«b<t>  ■ <W«db<‘> 


and 

W10  " Qbd*pdb^>  + Qba*pab<t>  + Vfc> 

would  have  to  hold.  However,  entrance  to  d from  state  a results  in  a 
greater  probability  for  a given  holding  time  in  d than  an  entrance  from 
state  b.  Consequently,  Pj^t)  is  not  well  defined. 

Another  way  of  stating  this  inconsistency  is  provided  by 

Definition  10.  Let  Pxy(t}w)  be  the  Fundamental  Probability  Function, 
from  x to  y,  given  that  entrance  into  x is  from  w. 

Then  consistency  requires  that  Pxy(t;w)  be  independent  of  state  w. 
However,  for  SMC(9)', 

Pdb<t;t>)  * Pdb(tja) 

Similar  results  are  obtained  if  we  pick  (0,1)  as  the  pev  and  define 
Qad  formally. 

Under  certain  conditions,  we  can  still  reduce  a MC  to  a SMC  in  the 
case  that  the  relevant  probability  functions  are  indexed  by  ensembles  of 
MC  states  as  occurs  in  SMC(9) ' . The  dependence  of  the  probability  functions 
on  the  entrance  ensemble  is  equivalent  to  the  dependency  of  the  pev's.  He 
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therefore  drop  the  restriction  of  pov  independence  by  using  v(*;y) 

Co  devote  tho  pov  of  tho  onsoablo  x glvon  on  ontroneo  from  y.“  Further- 
moroi  olnco  vj(x;y)  bolng  loro , for  o glvon  MC  stoto  J , con  imply  that 
j cannot  bo  reached  from  any  othor  atatoa  in  x,  x itaoif  bocomoa  o 
function  of  yi  x ■ x(y).  Furthor  dependence  is  handled  by  dropping 
tho  lnnor  parenthesis:  for  example,  x(y<v)>  ■ x(yv) . Lotting  a,  b,  c, 

d,  ...  bo  (disjoint)  ensembles  of  MC  stotoo  which  wo  wish  to  transform 
into  macrostates,  wo  make  a provisional  definition  for  tho  holding  time 
pdf's  in 

Definition.  Given  a,  b,  c,  and  v(aic) 


j 

where  j varies  over  the  eot  a and  B ia  the  absorbing  "state"  correeponding 
to  b. 


Given  the  underlying  MC,  M( ■ ) i the  above  Definition  will  yiold  a SMC 
iff  (letting  R(i  ■ M(Un),  Un  being  tho  elapaad  time) 

PlRn+l  in  b(ac*<>)|Bn  ln  «(<!•••)»  Rn-i  in  c(d**0,  •••Ro  in  y] 

■ R(Riri-l  in  bCaJlRa  in  o(c)  ] 

“ P((Rn.  R«+l)  * («.b),  - t|  (Rq-1,  Rn>  ■ <«»  *>1 


1 
i 

i 


where  Tn+1  ■ Un+i*-U_ . Thus  v(b,  a(c*  • • ) ) ■ v(b{a)  and  Tn+i  depends  only 
on  Qab(  »c).  Thorerore,  it  Ts  necessary  and  sufficient  to  require  that 
a(c)  include  all  the  states  of  a which  communicate  with  the  states  of  all 
other  ensembles,  (for  all  a,  c)  since  v(b;a)  depends  only  on  the  one  step 
MC  traneltlonal  probabilities.  In  particular,  it  is  sufficient  that  a(c)  ■ 
a,  for  all  aets  a and  c. 

Under  the  ab)ve  necessary  and  sufficient  condition,  we  can  now 

write 


Pad<fcic>  “ I Qab(  ic>*I»bd<  ;a)(t) 

b 

+ 4a,dJa<tSc>- 


' i 
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B 


t 


" V 


From  another  point  of  view,  vs  can  also  lat  a(c)  danota  tha  atata 
ic))»  x varying  over  tha  exit  atataa.  Ualng  thla  latter 
notation,  wa  oan  aat 

Qab<t»c)  • Qa(c),b(t). 


for  a given  MC,  tha  raaultant  number  of  atataa  may  be  email  enough 
to  warrant  SMC  reduction,  In  tha  above  eaee  of  dependent  pev'a.  If  tha 
reduction  in  complexity  la  aubatantlal  enough.  Thla  extended  SMC  re- 
duction can  be  applied  to  SMC(9);  S (a)  ■ the  ordered  aet  (0,0')  and  S(b)  * 
(O').  However,  nothing  la  gained  here  alnce  we  at ill  have  4 atataa. 

In  cloalng  thla  chapter,  we  point  out  yet  another  deviation  from  the 
condltlona  of  a atata  independent,  etatlonary  pev.  The  deviant  condition 
can  be  found  In  [6.2,  Chp.  5].  Tha  type  of  pev  found  there  la  an  Initial 
pev  uaed  in  the  arbitrary  entry  caae  of  CSF'a.  It  la  ahown  that  tha 
exlatance  of  theae  pev'a  la  equivalent  to  that  of  Initial  (or  delayed) 
holding  time  pdf 'a  In  the  atationary  (or  random  entry)  caae  for  argodic 
SMC' a.  Thua,  thla  apecial  type  of  pav  la  handled  in  a manner  analogoua 
to  that  uaed  for  atate  dependent  pev'a  - aa  an  "index"  (given,  in  the 
paper  cited,  by  a prime  over  the  Q'a). 


i-1 

1 

i ; 

J 

i 

i 
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5.0  CONCLUSION. 

5.1  Summery.  Two  approachee  to  the  DSI  modification  of  CSP-11  are 
considered  In  Chaptera  2 through  4.  The  firat  approach,  found  in 
Chaptara  2 and  3,  Ignores  any  overlap  in  the  Inspection  process  by  using 
a functional,  defined  on  a new  DSI  model,  to  count  only  the  additional 
units  which  are  inspected  from  sampling  phase  segments  - units  which 
would  otherwise  not  be  inspected  under  CSP-11.  Since  the  functional  TFI 
is  not  sufficient  to  deal  with  all  the  important  aspects  of  CSP-12,  a 
second  approach,  found  in  Chapter  4,  uses  a new  functional,  defined  on  a 
slightly  different  DSI  model,  to  take  account  of  inspection  overlaps.  In 
either  treatment,  there  is  no  explicit  backtracking  in  operational  time 
Itself;  both  approaches  incorporate  the  time  shift  into  the  transitional 
changes,  Induced  by  DSI,  which  are,  in  turn,  incorporated  in  the  pdf's  of 
the  underlying  models.  Throughout  the  paper,  variations  in  functionals  and 
uimpling  plans,  together  with  comparisons  of  them  with  the  primary  objects 
of  study  are  also  considered. 

5.2  Methods  Used.  Two  principal  tools  are  used  in  the  analysis  of  DSI: 

SMC  (and  MRP)  reduction  and  the  s-transform.  Since  the  SMC's  constructed 
for  the  analysis  are  modifications  of  the  SMC  model  of  CSP-11,  the  process 


i 
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of  constructing  s SMC  clsss  from  s MC  model,  dsoerlbsd  in  Chapter  1, 
is  turned  around.  In  Chapter  A.  the  importance  of  the  probability 
entrance  vector  (pev)  is  brought  out  by  the  incompatibility  of  SMC (9) 
with  the  other  CS9-12  models.  Also  in  Chapter  A,  the  use  of  aft 
Expanded  MRP  in  the  ana ly a is  of  DSI  is  illustrated;  this  kind  of 
analysis  could  be  elaborated  on  for  further  Investigation  of  functionals 
dependent  on  a sequence  of  transitions. 

Ve  conclude  this  paper  with  the  observation  that  DSI  can  be  used 
to  modify  the  more  complex  CSP'e  described  in  Chapter  1. 


APPENDIX 


A.O  SEMI  MARKOV  CHAINS.  Given  that  X(* ) la  a time  honogenaoua,  aperiodic, 
irreducible  or  absorbing,  and  finite  state  Semi  Markov  Chain  (SMC)  with 
•tats  space  S,  the  following  notation  and  statements  are  used  in  the  body 
of  the  text  [cf.,  6.7,  6.10,  6.14,  and  6.15]. 

A.l  Notation  and  Definitions.  For  1,  j,  k,  l in  S: 

1.  Qik(t)  - P[X(t)«k,  X(t')-i,  0 < t*  < t |x(0)*i] .* 

This  function  is  the  (defective)  pdf  of  the  time  of  sojourn  in  state  1 
until  a transition  is  made  to  state  k (for  discrete  t and  i ^k). 

2.  Plk(t)  - P[X(t)-k|x(0)-i]. 

This  function  is  the  fundamental  probability  function  of  the  SMC 
for  (i  to  k). 

3.  Flk(t)  ■ P[X(t)-k;  X(t')  *k,  0 < t*  < t |x(0)-i] . 

This  function  is  the  first  entrance  probability  function  for  (i  to  k) . 

4.  Jk(t)  - Ho*(«0-Z  QkAXt). 

A 

This  function  is  the  probability  of  not  leaving  state  k by  time  t. 

5.  Un(k)  la  the  time  of  nth  entry  Into  k. 

6.  Nk(t)  - Max  j n/Un(k)  i t} 

This  random  variable  is  the  renewal  function  for  state  k. 

7.  Un  is  the  time  of  n^h  entry. 

8.  Y(n)  ■ X(Un)  is  the  embedded  Markov  Chain  associated  with  the  SMC. 


*Thls  definition  corrects  statement  3,  definition  5 in  [6.2,  p.  664]. 


472 


... 


For  the  case  where  self  treneltione  are  allowed,  we  can  uae  the 
■ymbole  above  to  define  a Markov  Renewal  Proceaa  (MRP) . 

9.  A MRP  la  the  ordered  pair  (Y,  U)  such  that,  for  states  1,  k In  S, 
*IV*.  Tn«t|Yn_i"l,  Yn_2,  ....  Yq J Tn_x,  Tn_2,  •••,  T0] 

- PlYn«k,  Tn-tlYn.i-1],  Tn  - Un-Un-! 

■ Qik(t). 

(Note  that  this  pdf  Is,  In  general,  different  from  that  defined  in  A. 11.)* 

10.  The  SMC  X(t)  associated  with  a MRP  is  defined  by 
X(t)  - Y (t) 

" YN(t) 

where  N(t)  ■ I Nj  (t) , j in  S. 

A.  2 Statements. 

1.  By  time  homogeneity  and  the  method  of  first  entrance,  we  have  the 
Backward  Equations: 

Pik(t)  - I Qij*Pjk<t>  + <*ik>Jk<t>- 

5 

2.  Pik(t)  - Flk*Pkk(t)  + (5lk)Jk(t). 

3.  If  qlk  ■ Ho*Qik(+  •). 

T - tqik] 

is  the  transitional  matrix  of  Y. 

*This  definition  corrects  that  given  in  [6.2,  p.  695]. 
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4.  II  X la  Irreducible,  the  aquation 


jtf  - a 

has  a uniqua  normalised  solution  called  the  stationary  vector  of 
tha  SMC. 

ekUk 

3.  Lla  Plk(t)  - -*-S- 

***  ZV* 

1 


- ak  (or  Pk(«0) 

vhere  pk  la  tha  mean  time  of  sojourn  In  state  k and  tha  a^'a  are  the 
components  of  a,. 

6<  lu  (?)  ■ »k  

7.  (Strong  Ergodlc  Theorem.)  If  W la  a functional  defined  on  the  SMC, 
ve  have,  aa  N approaches  Infinity, 

^ l W(X(s))  approaches  E0[W]»  [a.e.] 


- I W(k)ak. 
k 


In  the  case  of  self  transitions,  we  have 

8.  If  (Y,  U)  is  a MRP  such  that  q^  < 1,  the  unique  SMC  induced  by  the  MRP 
has  Ita  pdf's  given  via  (i  * j) 


$Jj  ■ . 1*  111  > 0 

i-Qu 

A 

• Qij , otherwise 


whore  the  Q's  are  given  by  A. 19.  It  Is  equivalent,  almost  everywhere, 
to  the  associated  SMC. 

9.  The  properties  of  time  homogeneity,  irreducibility , and  aperiodicity 
are  preserved  under  filtration. 
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PROGRESSIVELY  CP'SORED  SAMPLING  IN  THE 

THREE  PARAMETER  LOG-NORMAL  DISTRIBUTION* 


A.  Cliiford  Cohen 

The  Un Ivors ity  of  Georgia 
Athens,  Georgia 

SUMMARY 


This  paper  is  an  extension  of  previous  work  by  the  writer  con- 
cerning progressively  censored  sampling  in  the  normal  distribution  [4] 
and  in  the  Weibull  distribution  [6].  Here  local  maximum  likelihood 
estimators  and  estimators  which  utilize  the  first  order  statistic  are 
derived  for  the  three-parameter  log-normal  distribution  whon  samples 
are  progressively  censored.  An  illustrative  example  involving  life 
test  data  is  included.  Various  properties  of  the  proposed  estimators 
are  investigated. 

KEY  WORDS 

Log-normal  Distribution 
Progressively  Censored  Samples 
Life  Testing 

1,  INTRODUCTION 

Progressively  censored  samples  frequently  occur  in  life  and  fa- 
tigue tests,  where  individual  observations  are  time  ordered  and  where 
at  various  times  during  a test,  some  of  the  survivors  are  removed 
(i.e.  censored)  from  further  observation.  Samples  of  this  type  from 
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the  normal  and  from  the  exponential  distribution  have  received  previous 
attention  from  Herd  [10],  Roberts  [18],  and  the  writer  [4].  Progressively 
censored  samples  from  the  two-parameter  WeibulJ  distribution  were  con- 
sidered by  the  writer  [5]  and  by  Ringer  and  Sprinkle  [17].  More  recent 
work  by  the  writer  [6]  deals  with  progressive  censoring  in  the  three- 
parameter  Weibull  distribution.  The  present  paper  is  concerned  with 
progressive  censoring  in  the  three -parameter  log-normal  distribution. 


2.  THE  SAMPLE 


Let  N designate  the  total  sample  sire,  and  n the  number  which  fail 
and  therefore  result  in  completely  determined  life  spans.  Suppose  that 
censoring  occurs  in  k stages  at  times  T^T^,  j»l,  2,  ...,  k,  and  that 
surviving  items  are  removed  (censored)  from  further  observation  at 
the  jth  stage.  Thus 


N ■ n ♦ Ej  r j . 


(1) 


Two  types  of  censoring  are  generally  recognized.  In  Type  I censoring, 
which  is  of  primary  interest  here,  the  T^  are  fixed,  and  the  number  of 
survivors  at  these  times  are  random  variables.  In  Type  II  censoring, 
the  number  of  survivors  are  fixed  and  the  T^  are  random  variables.  In 
both  types,  the  r^  are  either  fixed  or  determined  independently  of  the 
life  span  X.  The  observations  x^  are  ordered  according  to  magnitude. 

The  likelihood  function  1(S),  where  S signifios  a k-stago  Type  I 
progressively  censored  sample  of  the  type  described,  is 

l(S)  • C n f(x.)  n [1  - FCT.)]1*,  (2) 

i«l  1 j-1  J 

in  which  C is  a constant  while  f(x)  and  F(x)  are  density  and  distribu- 
tion functions  respectively. 
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3.  THE  LOG-NORMAL  DISTRI  BOTKIN 

We  write  the  density  function  for  the  three-parameter  log-normal 


distribution  as 


f(x;  P»  o,  y) 


1 ,-[ln(x-r)-uI2/2a2>Y  < x « 


» * r w w ’ r (I  * » » 

0/2lT  (X-Y) 

■ 0,  elsewhere. 

This  distribution  derives  its  name  from  the  fact  that  when  the  random 

2 2 
variable  X is  lognormal  (u,  o » y)  » then  Y ■ ln(X-Y)  is  normal  (y,  o ). 

The  mean,  median,  mode,  variance,  coefficient  of  variation,  B^  and  Bj 

(Pearson's  Betas)  for  this  distribution  (c.f.  Yuan  [23])  are 


yx  » Y + e'W, 

Me  ■ y + ey, 

Mo  ■ y * e^/to , 

V(x)  - e2M  U)(W-1), 

c v ■ /ITT, 

Bj  ■ a2  » (w+2)2  (w-l), 

4 3 2 

®2  " a4  a w '*’^w  + 

where 


w ■ e , (5j 

and  where  and  a4  denote  the  third  and  fourth  standard  moments. 

The  coefficient  of  variation  about  the  left  terminus  is  defined  as 

C V - /V00  / (yx-Y)  • (6) 

Previous  investigations  by  the  writer  [3],  Aitchison  and  Brown  [1], 
Hill  [11] , Wilson  and  Worcester  [2l],  and  others  have  dealt  with  maximum 
likelihood  estimation  in  tho  throe  parameter  log-normal  distribution  when 


479 


samples  are  complete.  Harter  and  Moore  [9]  considered  local  maximum 
likelihood  estimation  in  the  three  parameter  log-normal  distribution 
for  singly  and  doubly  censored  as  well  as  for  complete  samples.  Hill 
examined  some  unusual  features  of  the  likelihood  function  of  this  dis- 
tribution which  had  apparently  escaped  tho  notice  of  earlier  investi- 
gators. He  demonstrated  the  existence  of  paths  along  which  tho  like- 
lihood function  of  any  ordered  sample  x^,  ....  tends  to  • as 
2 

(y,  u,  o ) approach  (Xj,  •). 

This  global  maximum  of  the  likelihood  function  thereby  loads  to 

A A AA 

the  inadmissible  estimators,  y * Xj,  v * -»  and  cj  ■ • regardloss  of 
the  sample.  On  the  other  hand,  when  wo  equate  partial  derivatives  of 
the  log-likelihood  ftaction  to  zero,  solution  of  these  equations  leads 
to  local  maximum  likelihood  estimates  which  in  most  cases  are  reason- 
able and  as  noted  by  Harter  and  Moore  (loc.  cit.)  appear  to  possess 
most  of  the  desirable  properties  ordinarily  associated  with  maximum 
likelihood  estimators.  Exceptions  may  occur  in  small  samples  for  which 
the  likelihood  function  fails  to  exhibit  a clearly  dofined  local 
maximum. 

4.  LOCAL  MAXIMUM  LIKELIHOOD  ESTIMATION 

With  the  p.d.f.  as  given  in  equation  (3),  the  logarithm  of  the 
likelihood  function  (2}  becomes 

InL  ■ -nlno  - z”ln(x^-y)  - -nj-  Ej  [lnCx^-v) -w3 2 

+ ijr^  ln[l-Fj]  + In  C.  (7) 


r?.. . C-  'K  - •'ri  i! 


1 


Local  maximum  likelihood  estimators (LMLE)  are  obtained  by  simul- 
taneously solving  tho  estimating  equations 

3 F. 


3 V 


3 ln  L - ij-  rJ[ln(xi-Y)-u]  - ^(7^7)  m °» 

Q 


1 


r . 


3 o 


8 In  L _ -n  1 »>n»,  * , i2  3 % 3 j « 

T “T  C1[in(xi-Y)“M]  - ^(yrr5  FT"  °* 

o J 

a f. 


(8) 


In  I n 1 In  ln(Xa-y)-u  v Ta 

■p  ■ *?<£»»  * Jr  £1(-  ->  ■ 


3 Y 


0. 


Let 


where 


<K5J 

zi " Z(V  * rmp*  * 


(9) 


Fj  • F(T  ) - r ^ f(x)dx  ^g(y)dy  •J'  d> C*) dz  ■ 


F(Cj) i 


(10) 


in  which  f(x)  is  given  by  (3),  g(y)  is  the  normal  density  (u,  o ),  $(z) 
is  tho  standard  normal  density  (0,1),  and 

(11) 


y^  « ln(T^-Y).  whereas  » (y.j-)j)/c. 


It  then  follows  from  (9), (10)  and  (11)  that 

8 F,  o 3 F| 

- ■ "2, , (m 

j 


(T7J)  Tv 


"2j  • (Trf*  r^s  “Vr and  (ttfj5  ft1 


Tp?  * 


(12) 


When  the  results  of  (12)  arc  substituted  into  (8),  the  estimating 
equations  bccomo 


\ 


481 


lJ[ln(xi-Y)-w]  ♦ olfr^ 


- 0, 


tJflnCxj-Yj-Mj2  ♦ ®2IsirJ«jZj-n]  - 0, 


(13) 


ln(x.-Y)-M 

tjr  1 


TtZ. 


r-<  1 * ■ 


Vv 


0. 


Various  iterative  techniques  are  available  for  simultaneously 

A A A 

solving  these  three  equations  for  the  required  estimates  y,  c,  and  y, 

A procedure  that  has  performed  quite  well  for  the  writer  involves 
selecting  a trial  value  Yi  for  y»  solving  the  first  two  equations  with 
Y«Y^  for  and  using  the  standard  Newton  technique  (e.f.  page  90 
of  reference  [20]),  and  then  substituting  these  values  into  the  third 
equation  of  (13).  Once  two  values  ya  and  Yj  have  been  found  such  that 
the  absolute  difference  l^-Yjl  i*  sufficiently  small  and  such  that 
HCl^.y^op  > o > H(Yj,yj,Oj),  where  H(y, y,o)  designates  the  left  side 
of  fite  third  equation  of  (13),  the  required  estimates  follow  by  linear 
interpolation.  The  smallest  sample  observation,  Xj , is  of  course  an 
upper  bound  on  y and  may  thus  be  employed  as  a first  approximation  y1 
in  the  iteration  procedure. 

In  the  event  that  the  third  estimating  equation  of  (13)  is  not 
satisfied  for  any  value  of  y in  the  permissible  interval  y < then 
the  modified  estimatoaof  Section  5 are  to  be  recommended. 

Harter  and  Moore  encountered  the  related  problem  in  connection 
with  samples  that  ar<*  singly  and  doubly  censored.  With  r observations 
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censored  on  the  left  so  that  xy  is  an  upper  bound  on  > , their  recom- 
mendation is  that  an  additional  observation  be  censored  on  the  left  so 
that  then  becomes  a new  upper  bound  on  y. 


5.  MODIFIED  MAXIMUM  LIKELIHOOD  ESTIMATION 
Alternate  estimators  (MMLE)  which  have  proven  most  satisfactory 
in  numerous  applications,  can  be  obtained  by  simultaneously  solving  the 
estimating  equations 


fP  . o,  - 0,  and  E[F(xr)]  - F(xr), 

where  X^,  is  the  rth  order  statistic  in  a sample  of  size  N.  Only  those 
failures  which  occur  prior  to  the  time  at  which  the  first  stage  of 
censoring  takes  place,  provide  observed  values  for  order  statistics, 
and  thus  the  maximum  value  of  r is  limited.  In  most  applications,  we 
might  choose  to  set  r>l,  but  a larger  value  might  be  preferred  if  there 
is  reason  to  suspect  contamination  of  the  sample  data  in  the  vicinity 
of  the  terminus.  Applicable  estimating  equations  accordingly  consist 
of  the  first  two  equations  of  (13)  plus  a third  equation  involving 
as  derived  below.  Since 

F(V  ■/  rf(x)dx,  and  since  E[F(xr)J  » , (14) 

Y 


it  follows  that  our  third  estimating  equation  becomes 


Y • Xy 


- e 


v +o*. 


(IS) 


whore  K is  the  standard  normal  deviate  for  which 
r 


{v,2'2dt 


r 

R7T* 


(16) 
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The  modified  estimators  accordingly  aro  found  by  simultaneously  j 

solving  the  set  of  equations  consisting  of  tho  first  two  equations  of  ' 

(13)  plus  equation  (15).  The  same  procedure  employed  in  Section  4 to  f 

calculate  the  LMLE  is  also  applicable  here.  On  determining  and  Yj 
such  that  | y . -y  . | is  sufficiently  small  and  such  that  G(y,,u.,o.)  > j 

£ G(y^,u^,o^),  where  G(y,u,o)  « y + e ”,  we  interpolate  for  the 

roquired  estimates  just  as  we  did  in  Section  4.  j 

6.  SOME  SPECIAL  CASES 

Various  special  casos  in  which  at  least  one  of  the  parameters  is 
known,  are  of  interest  in  certain  applications.  The  following  are 
considered  to  be  deserving  of  mention  at  this  timo. 

MLE  with  y known. 

With  y known,  there  is  no  .longer  any  distinction  to  be  made 
between  a local  maximum  and  a global  maximum.  The  applicable  estimating 
equations  in  this  case  are  the  first  two  equations  of  (13),  and  they  may 

A * 

be  solved  iteratively  for  the  required  estimates  u and  o as  outlinod 
in  Section  4.  As  on  alternate  technique,  we  might  make  the  transformation 
y^  « ln(Xj-y)  and  then  proceed  as  described  in  reference  [4]  for  a 
progressively  censored  sample  from  a normal  distribution.  Gajjar  and 
Khatri  [7]  previously  considered  this  special  case. 

LMLE  with  a known. 

It  often  happens  that  the  shape  parameter  o and  thus  are  known, 
leaving  only  w and  y to  be  estimated  from  the  sample  data.  In  this  case, 
the  applicable  estimating  equations  consist  of  the  first  and  third 
equations  of  (13).  * 
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W4LB  with  o known. 


In  this  case,  the  applicable  estimating  equations  consist  of  the 
first  equation  of  (13)  plus  equation  (IS). 


7.  ESTIMATE  VARIANCES  AND  COVARIANCES 

A A 

The  asymptotic  variance- covariance  matrix  of  the  estimators  a,  e, 

A 

and  y is  obtained  by  inverting  the  information  matrix  in  which  elements 
are  negatives  of  expected  values  of  the  second  partial  derivatives  of 
the  logarithm  of  the  likelihood  function.  For  sufficiently  large 
samples,  these  expected  values  can  be  approximated  by  substituting  the 
estimates  obtained  from  a given  sample  directly  into  the  partial  deri- 
vatives which  are  given  below. 


3 In  L _ -n  1 — •»/»*  % 

77 7'7  1 j j j1- 


^>cvW  • yJvjV2  * VYV1 


) In  I „ 1 


n [tn(x.-Y)-y-l+o  j k r,Z.[o-(Z  -«.)] 

£? 1 , + e?  * ■ J... ■■■  

1 (xrrr  1 (^-Yr 


32  In  L 32  In  L 


ay8Y 


a2  In 
So  S'y 


3y3v 


0*  „2  1 (Tj-y) 


(17) 


L S2lnl  ! .n  1 

- ■ sfss 7 Ei  — ■ 7 h (t  -vj 


>2  In  L S2  In  L 


aydo 


3o3y 


- - ^E’J[ln(xi-Y)-y]  - ^ z\ 


a~  " o 

A A A 

Since  the  estimators  y,  a and  y are  local  rather  than  global 
maximum  likelihood  estimators,  the  applicability  of  the  variance-covariance 
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matrix  obtained  hero,  might  be  open  to  question.  However,  a Monte 
Carlo  study  by  Nicholas  Norgaard  [16]  indicates  that  the  approximate 
asymptotic  variances  and  covariances  obtained  here  should  be  considered 
satisfactory  when  n > 50,  although  they  might  be  misleading  as  measures 
of  sampling  error  for  small  samples.  Norgaard's  results  are  consistent 
with  results  of  an  earlier  Monte  Carlo  study  by  Harter  and  Moore  (loc. 
cit.)  in  connection  with  singly  and  doubly  censored  samples.  It  is 
also  to  be  noted  that  Norgaard's  study  indicates  that  variances  and 
covariances  of  the  MMLE  are  approximately  equal  to  corresponding 
measures  of  the  MLE.  This  is  an  area  of  investigation  that  is  continuing 
to  receive  attention  both  from  Norgaard  and  the  writer. 


8.  AN  ILLUSTRATIVE  EXAMPLE 

A simulated  life  test  was  conducted  on  100  randomly  selected 
units  of  a certain  electronic  device  having  a log-normal  life  span 

A 

with  v * S.0000,  o ■ 0.3000  and  y ■ 100.  Sixty-five  complete  life 
spans  were  observed,  while  thirty-five  observations  were  censored  in 
three  separate  stages.  Following  are  the  life  spans  in  hours  to  two 


places 

of  decimal, 

for  the  65 

units  which 

failed  during  tho  test. 

167.91 

200.88 

219.14 

232.91 

246.61 

262.59 

287.71 

17S.83 

201.76 

220.59 

235.66 

247.17 

263.94 

288.81 

185.88 

205.31 

222.00 

236.75 

249.14 

266.12 

291.30 

188.14 

206.98 

222.82 

237.40 

249.73 

266.62 

295.18 

189.08 

210.78 

224.33 

239.05 

250.09 

267.01 

297.38 

191.96 

212.49 

225.60 

240.22 

252.89 

270.64 

195.61 

213.24 

226.50 

240.64 

253.57 

271.76 

197.01 

215.25 

227.24 

242.17 

255.57 

275.48 

198.76 

216.75 

227.24 

243.03 

260.60 

279.62 

199.05 

218.78 

231.42 

244.56 

261.99 

285.19 
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When  the  tenth  failure  occured  at  time  Tj  ■ 199.05,  twelve  units 
■elected  at  random  from  the  survivors  were  censored  (i.e.  removed  from 
the  test).  When  the  forty-fifth  failure  occured  at  time  Tg  ■ 250.09, 
ten  additional  randomly  selected  survivors  were  removed,  and  the  test 
was  terminated  at  time  Tj  ■ 297.58  with  13  survivors.  In  summarising 
these  data,  we  record:  N ■ 100,  n ■ 65,  ■ 35,  ■ 167.91,  Tj  ■ 

199.05,  tx  • 12,  T2  ■ 250.09,  r^  ■ 10,  Tj  • 297.38,  r3  ■ 13,  - 

15,327.43,  ?<e  . 235.8066. 

Estimates  were  calculated  as  described  in  Sections 4,  S and  6 
and  are  summarized  in  the  following  table. 

In  general,  the  estimates  obtained  here  compare  favorably  with 
corresponding  population  parameters. 
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ABLE  1 - SIAMARY  OP  ESTIMATES 
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